Welcome!


This book is designed to help you learn the major concepts of single-variable 
calculus, while also concentrating on problem-solving techniques. Whether 
this is your first exposure to calculus, or you are studying for a test, or you’ve 
already taken calculus and want to refresh your memory, I hope that this book 
will be a useful resource. 

The inspiration for this book came from my students at Princeton University.
Over the past few years, they have found early drafts to be helpful as a
study guide in conjunction with lectures, review sessions and their textbook. 
Here are some of the questions that they’ve asked along the way, which you 
might also be inclined to ask: 

• Why is this book so long? I assume that you, the reader, are motivated
to the extent that you’d like to master the subject. Not wanting
to get by with the bare minimum, you’re prepared to put in some time
and effort reading, and understanding, these detailed explanations.

• What do I need to know before I start reading? You need to 
know some basic algebra and how to solve simple equations. Most of 
the precalculus you need is covered in the first two chapters. 

• Help! The final is in one week, and I don’t know anything! 
Where do I start? The next three pages describe how to use this 
book to study for an exam. 

• Where are all the worked solutions to examples? All I see is 
lots of words with a few equations. Looking at a worked solution 
doesn’t tell you how to think of it in the first place. So, I usually try to 
give a sort of “inner monologue”: what should be going through your
head as you try to solve the problem. You end up with all the pieces of 
the solution, but you still need to write it up properly. My advice is to 
read the solution, then come back later and try to work it out again by 
yourself. 

• Where are the proofs of the theorems? Most of the theorems in 
this book are justified in some way. More formal proofs can be found in 
Appendix A. 

• The topics are out of order! What do I do? There’s no standard
order for learning calculus. The order I have chosen works, but you might
have to search the table of contents to find the topics you need and ignore
the rest for now. I may also have missed some topics. Why not
try emailing me at adrian@calclifesaver.com? You never know, I just
might write an extra section or chapter for you (and for the next edition,
if there is one!).

• Some of the methods you use are different from the methods
I learned. Who is right: my instructor or you? Hopefully we’re
both right! If in doubt, ask your instructor what’s acceptable.

• Where’s all the calculus history and fun facts in the margins? 
Look, there’s a little bit of history in this book, but let’s not get too 
distracted here. After you get this stuff down, read a book on the 
history of calculus. It’s interesting stuff, and deserves more attention 
than a couple of sentences here and there. 

• Could my school use this book as a textbook? Paired with a 
good collection of exercises, this book could function as a textbook, as 
well as being a study guide. Your instructor might also find the book 
useful to help prepare lectures, particularly in regard to problem-solving 
techniques. 

• What’s with these videos? You can find videos of a year’s supply of 
my review sessions, which reference a lot (but not all!) of the sections 
and examples from this book, at this website: 

www.calclifesaver.com


How to Use This Book to Study for an Exam

There’s a good chance you have a test or exam coming up soon. I am sympathetic
to your plight: you don’t have time to read the whole book! There’s a
table on the next page that identifies the main sections that will help you to
review for the exam. Also, throughout the book, the following icons appear
in the margin to let you quickly identify what’s relevant:

• A worked-out example begins on this line. 


• Here’s something really important. 


• You should try this yourself. 


• Beware: this part of the text is mostly for interest. If time is limited, 
skip to the next section. 



Also, some important formulas or theorems
have boxes around them: learn these well.

Two all-purpose study tips

• Write out your own summary of all the important points and formulas to
memorize. Math isn’t about memorization, but there are some key formulas
and methods that you should have at your fingertips. The act of making the
summary is often enough to solidify your understanding. This is the
reason why I don’t summarize the important points at the end of a chapter:
it’s much more valuable if you do it yourself.

• Try to get your hands on similar exams (maybe your school makes previous
years’ finals available, for example) and take these exams under proper
conditions. That means no breaks, no food, no books, no phone calls, no email,
no messaging, and so on. Then see if you can get a solution key and grade it,
or ask someone (nicely!) to grade it for you.

You’ll be on your way to that A if you do both of these things. 


Key sections for exam review (by topic)

Topic                            Subtopic                            Section(s)

Applications of differentiation  Related rates                       8.2
                                 Exponential growth and decay        9.6
                                 Finding global maxima and minima    11.1.3
                                 Rolle’s Theorem/Mean Value Theorem  11.2, 11.3
                                 Classifying critical points         11.5, 12.1.1
                                 Finding inflection points           11.4, 12.1.2
                                 Sketching graphs                    12.2, 12.3
                                 Optimization                        13.1
                                 Linearization/differentials         13.2
                                 Newton’s method                     13.3

Integration                      Definition                          16.2 (skip 16.2.1)
                                 Basic properties                    16.3
                                 Finding areas                       16.4
                                 Estimating definite integrals       16.5, Appendix B
                                 Average values/Mean Value Theorem   16.6
                                 Basic examples                      17.4, 17.6
                                 Substitution                        18.1
                                 Integration by parts                18.2
                                 Partial fractions                   18.3
                                 Trig integrals                      19.1, 19.2
                                 Trig substitutions                  19.3 (skip 19.3.6)
                                 Overview of integration techniques  19.4

Motion                           Velocity and acceleration
                                 Constant acceleration
                                 Simple harmonic motion              7.2.2
                                 Finding displacements               16.1.1

Improper integrals               Basics                              20.1, 20.2
                                 Problem-solving techniques          all of Chapter 21

Infinite series                  Basics                              22.1.2, 22.2
                                 Problem-solving techniques          all of Chapter 23

Taylor series and power series   Estimation and error estimates      all of Chapter 25
                                 Power/Taylor series problems        all of Chapter 26

Differential equations           Separable first-order               30.2
                                 First-order linear                  30.3
                                 Constant coefficients               30.4
                                 Modeling                            30.5

Miscellaneous topics             Parametric equations                27.1
                                 Polar coordinates                   27.2
                                 Complex numbers                     28.1-28.5
                                 Volumes                             29.1, 29.2
                                 Arc lengths                         29.3
                                 Surface areas                       29.4


Unless specified otherwise, the Section(s) column includes all subsections; for example,
6.2 includes 6.2.1 through 6.2.7.

CHAPTER 1


Functions, Graphs, and Lines


Trying to do calculus without using functions would be one of the most pointless
things you could do. If calculus had an ingredients list, functions would
be first on it, and by some margin too. So, the first two chapters of this book 
are designed to jog your memory about the main features of functions. This 
chapter contains a review of the following topics: 

• functions: their domain, codomain, and range, and the vertical line test; 

• inverse functions and the horizontal line test; 

• composition of functions; 

• odd and even functions; 

• graphs of linear functions and polynomials in general, as well as a brief 
survey of graphs of rational functions, exponentials, and logarithms; and 

• how to deal with absolute values. 

Trigonometric functions, or trig functions for short, are dealt with in the next 
chapter. So, let’s kick off with a review of what a function actually is. 


1.1 Functions


A function is a rule for transforming an object into another object. The 
object you start with is called the input, and comes from some set called the 
domain. What you get back is called the output; it comes from some set
called the codomain. 

Here are some examples of functions: 

• Suppose you write $f(x) = x^2$. You have just defined a function f which
transforms any number into its square. Since you didn’t say what the
domain or codomain are, it’s assumed that they are both $\mathbb{R}$, the set of all
real numbers. So you can square any real number, and get a real number
back. For example, f transforms 2 into 4; it transforms -1/2 into 1/4;
and it transforms 1 into 1. This last one isn’t much of a change at all, but
that’s no problem: the transformed object doesn’t have to be different
from the original one. When you write f(2) = 4, what you really mean
is that f transforms 2 into 4. By the way, f is the transformation
rule, while f(x) is the result of applying the transformation rule to the
variable x. So it’s technically not correct to say “f(x) is a function”; it
should be “f is a function.”

• Now, let $g(x) = x^2$ with domain consisting only of numbers greater than
or equal to 0. (Such numbers are called nonnegative.) This seems like
the same function as f, but it’s not: the domains are different. For
example, f(-1/2) = 1/4, but g(-1/2) isn’t defined. The function g just
chokes on anything not in the domain, refusing even to touch it. Since
g and f have the same rule, but the domain of g is smaller than the
domain of f, we say that g is formed by restricting the domain of f.

• Still letting $f(x) = x^2$, what do you make of f(horse)? Obviously this is
undefined, since you can’t square a horse. On the other hand, let’s set 

h(x) = number of legs x has, 


where the domain of h is the set of all animals. So h(horse) = 4, while
h(ant) = 6 and h(salmon) = 0. The codomain could be the set of
all nonnegative integers, since animals don’t have negative or fractional
numbers of legs. By the way, what is h(2)? This isn’t defined, of course,
since 2 isn’t in the domain. How many legs does a “2” have, after
all? The question doesn’t really make sense. You might also think that
h(chair) = 4, since most chairs have four legs, but that doesn’t work
either, since a chair isn’t an animal, and so “chair” is not in the domain
of h. That is, h(chair) is undefined.

• Suppose you have a dog called Junkster. Unfortunately, poor Junkster 
has indigestion. He eats something, then chews on it for a while and 
tries to digest it, fails, and hurls. Junkster has transformed the food 
into ... something else altogether. We could let 

j(x) = color of barf when Junkster eats x, 

where the domain of j is the set of foods that Junkster will eat. The 
codomain is the set of all colors. For this to work, we have to be confident 
that whenever Junkster eats a taco, his barf is always the same color 
(say, red). If it’s sometimes red and sometimes green, that’s no good: a
function must assign a unique output for each valid input.
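
If you like to experiment on a computer, the examples above can be mimicked in a few lines of Python. This is only a rough sketch: the helper name `make_function`, the idea of describing a domain by a yes/no test, and the tiny `LEGS` table of animals are all assumptions made for illustration, not notation used elsewhere in this book.

```python
def make_function(rule, in_domain):
    """Bundle a rule with a domain test; inputs outside the domain are rejected."""
    def func(x):
        if not in_domain(x):
            raise ValueError(f"{x!r} is not in the domain")
        return rule(x)
    return func

def is_real(x):
    # bool is excluded so that True/False are not treated as numbers
    return isinstance(x, (int, float)) and not isinstance(x, bool)

# f squares any real number; g has the same rule but a restricted domain.
f = make_function(lambda x: x ** 2, is_real)
g = make_function(lambda x: x ** 2, lambda x: is_real(x) and x >= 0)

# h counts legs; its domain is a (tiny, hypothetical) set of animals.
LEGS = {"horse": 4, "ant": 6, "salmon": 0}
h = make_function(lambda animal: LEGS[animal], lambda animal: animal in LEGS)

print(f(2), f(-0.5), f(1))   # 4 0.25 1
print(h("horse"), h("ant"))  # 4 6
try:
    g(-0.5)                  # same rule as f, but -0.5 is outside g's domain
except ValueError as err:
    print("undefined:", err)
```

Notice that f and g share a rule and differ only in the domain test, which is exactly what makes them different functions.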

Now we have to look at the concept of the range of a function. The range is 
the set of all outputs that could possibly occur. You can think of the function 
working on transforming everything in the domain, one object at a time; the 
collection of transformed objects is the range. You might get duplicates, but 
that’s OK. 

So why isn’t the range the same thing as the codomain? Well, the range
is actually a subset of the codomain. The codomain is a set of possible
outputs, while the range is the set of actual outputs. Here are the ranges of
the functions we looked at above:

• If $f(x) = x^2$ with domain $\mathbb{R}$ and codomain $\mathbb{R}$, the range is the set of
nonnegative numbers. After all, when you square a number, the result
cannot be negative. How do you know the range is all the nonnegative
numbers? Well, if you square every number, you definitely cover all
nonnegative numbers. For example, you get 2 by squaring $\sqrt{2}$ (or $-\sqrt{2}$).

• If $g(x) = x^2$, where the domain of g is only the nonnegative numbers
but the codomain is still all of $\mathbb{R}$, the range will again be the set of
nonnegative numbers. When you square every nonnegative number, you 
still cover all the nonnegative numbers. 

• If h(x) is the number of legs the animal x has, then the range is all 
the possible numbers of legs that any animal can have. I can think of 
animals that have 0, 2, 4, 6, and 8 legs, as well as some creepy-crawlies 
with more legs. If you include individual animals which have lost one or 
more legs, you can also include 1, 3, 5, and 7 in the mix, as well as other 
possibilities. In any case, the range of this function isn’t so clear-cut; 
you probably have to be a biologist to know the real answer. 

• Finally, if j(x) is the color of Junkster’s barf when he eats x, then the
range consists of all possible barf-colors. I dread to think what these 
are, but probably bright blue isn’t among them. 

1.1.1 Interval notation

In the rest of this book, our functions will always have codomain $\mathbb{R}$, and the
domain will always be as much of $\mathbb{R}$ as possible (unless stated otherwise).
So we’ll often be dealing with subsets of the real line, especially connected
intervals such as $\{x : 2 \le x < 5\}$. It’s a bit of a pain to write out the full set
notation like this, but it sure beats having to say “all the numbers between 2
and 5, including 2 but not 5.” We can do even better using interval notation.

We’ll write [a, b] to mean the set of all numbers between a and b, including
a and b themselves. So [a, b] means the set of all x such that $a \le x \le b$. For
example, [2,5] is the set of all real numbers between 2 and 5, including 2 and
5. (It’s not just the set consisting of 2, 3, 4, and 5: don’t forget that there are
loads of fractions and irrational numbers between 2 and 5, such as 5/2, $\sqrt{7}$,
and $\pi$.) An interval such as [a, b] is called closed.

If you don’t want the endpoints, change the square brackets to parentheses.
In particular, (a, b) is the set of all numbers between a and b, not including a
or b. So if x is in the interval (a, b), we know that a < x < b. The set (2,5)
includes all real numbers between 2 and 5, but not 2 or 5. An interval of the
form (a, b) is called open.

You can mix and match: [a, b) consists of all numbers between a and b,
including a but not b. And (a, b] includes b but not a. These intervals are
closed at one end and open at the other. Sometimes such intervals are called
half-open. An example is the set $\{x : 2 \le x < 5\}$ from above, which can also
be written as [2,5).

There’s also the useful notation $(a, \infty)$ for all the numbers greater than a,
not including a; $[a, \infty)$ is the same thing but with a included. There are three
other possibilities which involve $-\infty$; all in all, the situation looks like this:

    $(a, b)$              $\{x : a < x < b\}$
    $[a, b]$              $\{x : a \le x \le b\}$
    $(a, b]$              $\{x : a < x \le b\}$
    $[a, b)$              $\{x : a \le x < b\}$
    $(a, \infty)$         $\{x : x > a\}$
    $[a, \infty)$         $\{x : x \ge a\}$
    $(-\infty, b)$        $\{x : x < b\}$
    $(-\infty, b]$        $\{x : x \le b\}$
    $(-\infty, \infty)$   $\mathbb{R}$
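
Interval notation translates directly into a membership test. Here is a minimal Python sketch; the function name `in_interval` and its two `closed_left`/`closed_right` flags are choices made for this illustration only, and `math.inf` stands in for the symbol for infinity.

```python
import math

def in_interval(x, a, b, closed_left, closed_right):
    """Return True if x lies in the interval from a to b.

    closed_left / closed_right say whether each endpoint is included,
    so (a, b] corresponds to closed_left=False, closed_right=True.
    Use -math.inf or math.inf for unbounded ends (never closed there).
    """
    left_ok = (a <= x) if closed_left else (a < x)
    right_ok = (x <= b) if closed_right else (x < b)
    return left_ok and right_ok

# [2, 5): includes 2 but not 5
print(in_interval(2, 2, 5, True, False))         # True
print(in_interval(5, 2, 5, True, False))         # False
print(in_interval(math.pi, 2, 5, True, False))   # True

# (3, infinity): everything greater than 3, not including 3
print(in_interval(3, 3, math.inf, False, False))      # False
print(in_interval(10**9, 3, math.inf, False, False))  # True
```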


1.1.2 Finding the domain 



Sometimes the definition of a function will include the domain. (This was 
the case, for example, with our function g from Section 1.1 above.) Most of 
the time, however, the domain is not provided. The basic convention is that 
the domain consists of as much of the set of real numbers as possible. For 
example, if $k(x) = \sqrt{x}$, the domain can’t be all of $\mathbb{R}$, since you can’t take the
square root of a negative number. The domain must be $[0, \infty)$, which is just
the set of all numbers greater than or equal to 0. 

OK, so square roots of negative numbers are bad. What else can cause a 
screw-up? Here’s a list of the three most common possibilities: 

1. The denominator of a fraction can’t be zero. 


2. You can’t take the square root (or fourth root, sixth root, and so on) of 
a negative number. 

3. You can’t take the logarithm of a negative number or of 0. (Remember 
logs? If not, see Chapter 9!) 


You might recall that tan(90°) is also a problem, but this is really a special
case of the first item above. You see,

    $$\tan(90^\circ) = \frac{\sin(90^\circ)}{\cos(90^\circ)} = \frac{1}{0},$$

so the reason tan(90°) is undefined is really that a hidden denominator is zero.
Here’s another example: if we try to define

    $$f(x) = \frac{\log_{10}(x + 8)\,\sqrt{26 - 2x}}{(x - 2)(x + 19)},$$

then what is the domain of f? Well, for f(x) to make sense, here’s what needs
to happen:

• We need to take the square root of (26 - 2x), so this quantity had better
be nonnegative. That is, $26 - 2x \ge 0$. This can be rewritten as $x \le 13$.

• We also need to take the logarithm of (x + 8), so this quantity needs to
be positive. (Notice the difference between logs and square roots: you
can take the square root of 0, but you can’t take the log of 0.) Anyway,
we need x + 8 > 0, so x > -8. So far, we know that $-8 < x \le 13$, so
the domain is at most (-8, 13].

• The denominator can’t be 0; this means that $(x - 2) \ne 0$ and $(x + 19) \ne 0$.
In other words, $x \ne 2$ and $x \ne -19$. This last one isn’t a problem, since
we already know that x lies in (-8, 13], so x can’t possibly be -19. We
do have to exclude 2, though.

So we have found that the domain is the set (-8, 13] except for the number
2. This set could be written as $(-8, 13] \setminus \{2\}$. Here the backslash means “not
including.”
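
The three checks above can also be written as a short program. The following Python sketch is just one way to do it; the helper name `in_domain` is made up for this example, and the function f is the one defined in the text.

```python
import math

def in_domain(x):
    """Check the three conditions from the text for
    f(x) = log10(x + 8) * sqrt(26 - 2x) / ((x - 2)(x + 19))."""
    sqrt_ok = 26 - 2 * x >= 0            # square root needs a nonnegative input
    log_ok = x + 8 > 0                   # logarithm needs a positive input
    denom_ok = (x - 2) * (x + 19) != 0   # denominator must not be zero
    return sqrt_ok and log_ok and denom_ok

def f(x):
    if not in_domain(x):
        raise ValueError(f"{x} is not in the domain")
    return math.log10(x + 8) * math.sqrt(26 - 2 * x) / ((x - 2) * (x + 19))

for x in (-8, -7.5, 2, 13, 13.5, -19):
    print(x, in_domain(x))
# -8 False, -7.5 True, 2 False, 13 True, 13.5 False, -19 False
```

The endpoint 13 is allowed (the square root of 0 is fine) while -8 is not (the log of 0 is not), matching the half-open interval found above.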

1.1.3 Finding the range using the graph

Let’s define a new function F by specifying that its domain is [-2, 1] and that
$F(x) = x^2$ on this domain. (Remember, the codomain of any function we
look at will always be the set of all real numbers.) Is F the same function as
f, where $f(x) = x^2$ for all real numbers x? The answer is no, since the two
functions have different domains (even though they have the same rule). As
in the case of the function g from Section 1.1 above, the function F is formed
by restricting the domain of f.

Now, what is the range of F? Well, what happens if you square every
number between -2 and 1 inclusive? You should be able to work this out
directly, but this is a good opportunity to see how to use a graph to find the 
range of a function. The idea is to sketch the graph of the function, then 
imagine two rows of lights shining from the far left and far right of the graph 
horizontally toward the y-axis. The curve will cast two shadows, one on the 
left side and one on the right side of the y-axis. The range is the union of 
both shadows: that is, if any point on the y-axis lies in either the left-hand or 
the right-hand shadow, it is in the range of the function. Let’s see how this 
works with our function F: 


[Figure: graph of F on [-2, 1], with its left-hand and right-hand shadows on the y-axis]

The left-hand shadow covers all the points on the y-axis between 0 and 4
inclusive, which is [0,4]; on the other hand, the right-hand shadow covers
the points between 0 and 1 inclusive, which is [0,1]. The right-hand shadow
doesn’t contribute anything extra: the total coverage is still [0,4]. This is the
range of F.
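
You can get a quick numerical sanity check of this answer by sampling. The Python sketch below evaluates F on a fine grid of points in [-2, 1] and reports the smallest and largest outputs. The grid size `n` is an arbitrary choice, and a grid can only suggest the range, not prove it.

```python
def F(x):
    if not (-2 <= x <= 1):
        raise ValueError("x is outside the domain [-2, 1]")
    return x ** 2

# Sample the domain on an evenly spaced grid from -2 to 1.
n = 3000
xs = [-2 + 3 * k / n for k in range(n + 1)]
ys = [F(x) for x in xs]

print(min(ys), max(ys))  # 0.0 4.0
```

The smallest sampled output comes from x = 0 and the largest from x = -2, which agrees with the shadow [0,4] on the y-axis.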

1.1.4 The vertical line test 



In the last section, we used the graph of a function to find its range. The graph 
of a function is very important: it really shows you what the function “looks 
like.” We’ll be looking at techniques for sketching graphs in Chapter 12, but 
for now I’d like to remind you about the vertical line test. 

You can draw any figure you like on a coordinate plane, but the result 
may not be the graph of a function. So what’s special about the graph of a 
function? What is the graph of a function f, anyway? Well, it’s the collection
of all points with coordinates (x, f(x)), where x is in the domain of f. Here’s
another way of looking at this: start with some number x. If x is in the
domain, you plot the point (x, f(x)), which of course is at a height of f(x)
units above the point x on the x-axis. If x isn’t in the domain, you don’t plot
anything. Now repeat for every real number x to build up the graph.

Here’s the key idea: you can’t have two points with the same x-coordinate.
In other words, no two points on the graph can lie on the same vertical line.
Otherwise, how would you know which of the two or more heights above the
point x on the x-axis corresponds to the value of f(x)? So, this leads us to
the vertical line test: if you have some graph and you want to know whether 
it’s the graph of a function, see whether any vertical line intersects the graph 
more than once. If so, it’s not the graph of a function; but if no vertical line 
intersects the graph more than once, you are indeed dealing with the graph 
of a function. For example, the circle of radius 3 units centered at the origin 
has a graph like this: 



Such a commonplace object should be a function, right? No, check the vertical
lines that are shown in the diagram. Sure, to the left of -3 or to the right
of 3, there’s no problem: the vertical lines don’t even hit the graph, which is
fine. Even at -3 or 3, the vertical lines only intersect the curve in one point
each, which is also fine. The problem is when x is in the interval (-3, 3). For
any of these values of x, the vertical line through (x, 0) intersects the circle
twice, which screws up the circle’s potential function-status. You just don’t
know whether f(x) is the top point or the bottom point.

The best way to salvage the situation is to chop the circle in half horizontally
and choose only the top or the bottom half. The equation for the
whole circle is $x^2 + y^2 = 9$, whereas the equation for the top semicircle is
$y = \sqrt{9 - x^2}$. The bottom semicircle has equation $y = -\sqrt{9 - x^2}$. These last
two are functions, both with domain [-3, 3]. If you felt like chopping in a
different way, you wouldn’t actually have to take semicircles: you could chop
and change between the upper and lower semicircles, as long as you don’t violate
the vertical line test. For example, here’s the graph of a function which
also has domain [-3, 3]:



The vertical line test checks out, so this is indeed the graph of a function. 
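
For a finite list of plotted points, the vertical line test boils down to checking that no x-coordinate shows up with two different heights. Here is a small Python sketch of that idea, using sample points on the circle of radius 3. The function name `passes_vertical_line_test` and the choice of sample points are assumptions for this illustration, and a finite sample is of course only a stand-in for the full curve.

```python
import math

def passes_vertical_line_test(points):
    """True if no x-coordinate appears with two different y-values."""
    heights = {}
    for x, y in points:
        if x in heights and heights[x] != y:
            return False
        heights[x] = y
    return True

xs = [k / 10 for k in range(-30, 31)]              # sample x-values in [-3, 3]
top = [(x, math.sqrt(9 - x * x)) for x in xs]      # upper semicircle
bottom = [(x, -math.sqrt(9 - x * x)) for x in xs]  # lower semicircle

print(passes_vertical_line_test(top))            # True
print(passes_vertical_line_test(top + bottom))   # False
```

Either semicircle alone passes; the whole circle fails as soon as some x strictly between -3 and 3 is paired with both a positive and a negative height.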

1.2 Inverse Functions

Let’s say you have a function f. You present it with an input x; provided that
x is in the domain of f, you get back an output, which we call f(x). Now we
try to do things all backward and ask this question: if you pick a number y,
what input can you give to f in order to get back y as your output?

Here’s how to state the problem in math-speak: given a number y, what
x in the domain of f satisfies f(x) = y? The first thing to notice is that y
has to be in the range of f. Otherwise, by definition there are no values of
x such that f(x) = y. There would be nothing in the domain that f would
transform into y, since the range is all the possible outputs.

On the other hand, if y is in the range, there might be many values that
work. For example, if $f(x) = x^2$ (with domain $\mathbb{R}$), and we ask what value
of x transforms into 64, there are obviously two values of x: 8 and -8. On
the other hand, if $g(x) = x^3$, and we ask the same question, there’s only one
value of x, which is 4. The same would be true for any number we give to g
to transform, because any number has only one (real) cube root.

So, here’s the situation: we’re given a function f, and we pick y in the range
of f. Ideally, there will be exactly one value of x which satisfies f(x) = y.
If this is true for every value of y in the range, then we can define a new
function which reverses the transformation. Starting with the output y, the
new function finds the one and only input x which leads to the output. The
new function is called the inverse function of f, and is written as $f^{-1}$. Here’s
a summary of the situation in mathematical language:

1. Start with a function f such that for any y in the range of f, there is
exactly one number x such that f(x) = y. That is, different inputs give
different outputs. Now we will define the inverse function $f^{-1}$.

2. The domain of $f^{-1}$ is the same as the range of f.

3. The range of $f^{-1}$ is the same as the domain of f.

4. The value of $f^{-1}(y)$ is the number x such that f(x) = y. So,

if f(x) = y, then $f^{-1}(y) = x$.

The transformation $f^{-1}$ acts like an undo button for f: if you start with x
and transform it into y using the function f, then you can undo the effect of
the transformation by using the inverse function $f^{-1}$ on y to get x back.

This raises some questions: how do you see if there’s only one value of x 
that satisfies the equation f(x) = y? If so, how do you find the inverse, and
what does its graph look like? If not, how do you salvage the situation? We’ll 
answer these questions in the next three sections. 

1.2.1 The horizontal line test

For the first question — how to see that there’s only one value of x that works 
for any y in the range — perhaps the best way is to look at the graph of your 
function. We want to pick y in the range of f and hopefully only have one value
of x such that f(x) = y. What this means is that the horizontal line through 
the point (0,y) should intersect the graph exactly once, at some point (x,y). 
That x is the one we want. If the horizontal line intersects the curve more 
than once, there would be multiple potential inverses x, which is bad. In that 
case, the only way to get an inverse function is to restrict the domain; we’ll 
come back to this very shortly. What if the horizontal line doesn’t intersect 
the curve at all? Then y isn’t in the range after all, which is OK. 

So, we have just described the horizontal line test: if every horizontal line
intersects the graph of a function at most once, the function has an inverse.
If even one horizontal line intersects the graph more than once, there isn’t an
inverse function. For example, look at the graphs of $f(x) = x^3$ and $g(x) = x^2$:

No horizontal line hits y = f(x) more than once, so f has an inverse. On the
other hand, some of the horizontal lines hit the curve y = g(x) twice, so g
has no inverse. Here’s the problem: if you want to solve $y = x^2$ for x, where
y is positive, then there are two solutions, $x = \sqrt{y}$ and $x = -\sqrt{y}$. You don’t
know which one to take!
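
On a finite sample of inputs, the horizontal line test amounts to checking that no output value is produced by two different inputs. The Python sketch below does this for the cube and square functions. The name `passes_horizontal_line_test` and the integer sample points are choices made for this example, and passing on a sample is only evidence, not proof, that a function has an inverse.

```python
def passes_horizontal_line_test(func, xs):
    """Check, on the sample xs only, that different inputs give different outputs."""
    seen = {}
    for x in xs:
        y = func(x)
        if y in seen and seen[y] != x:
            return False   # two different inputs landed on the same height y
        seen[y] = x
    return True

xs = range(-10, 11)  # integer sample points, so the arithmetic is exact
print(passes_horizontal_line_test(lambda x: x ** 3, xs))            # True
print(passes_horizontal_line_test(lambda x: x ** 2, xs))            # False
print(passes_horizontal_line_test(lambda x: x ** 2, range(0, 11)))  # True
```

The last line hints at the fix discussed shortly: the squaring rule passes once the negative inputs are thrown away.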

1.2.2 Finding the inverse

Now let’s move on to the second of our questions: how do you find the inverse
of a function f? Well, you write down y = f(x) and try to solve for x. In
our example of $f(x) = x^3$, we have $y = x^3$, so $x = \sqrt[3]{y}$. This means that
$f^{-1}(y) = \sqrt[3]{y}$. If the variable y here offends you, by all means switch it to
x: you can write $f^{-1}(x) = \sqrt[3]{x}$ if you prefer. Of course, solving for x is not
always easy and in fact is often impossible. On the other hand, if you know
what the graph of your function looks like, the graph of the inverse function
is easy to find. The idea is to draw the line y = x on the graph, then pretend
that this line is a two-sided mirror. The inverse function is the reflection of
the original function in this mirror. When $f(x) = x^3$, here’s what $f^{-1}$ looks
like:



The original function f is reflected in the mirror y = x to get the inverse
function. Note that the domain and range of both f and $f^{-1}$ are the whole
real line.
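
Reflecting in the mirror y = x just swaps the two coordinates of every point on the graph. The following Python sketch shows this for the cube function and its inverse. The helper name `f_inverse` and the five sample points are choices made for illustration; the cube root is built by hand so that negative inputs are handled correctly.

```python
import math

def f(x):
    return x ** 3

def f_inverse(y):
    # Real cube root: take the root of |y|, then restore the sign of y.
    return math.copysign(abs(y) ** (1 / 3), y)

# Reflecting in the line y = x swaps the two coordinates of every point.
graph_of_f = [(x, f(x)) for x in (-2, -1, 0, 1, 2)]
graph_of_inverse = [(y, x) for (x, y) in graph_of_f]
print(graph_of_inverse)  # [(-8, -2), (-1, -1), (0, 0), (1, 1), (8, 2)]

# Each reflected point really does lie on the graph of the inverse.
for y, x in graph_of_inverse:
    assert math.isclose(f_inverse(y), x, abs_tol=1e-12)
```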

1.2.3 Restricting the domain 

Finally, we’ll address our third question: if the horizontal line test fails and 
there’s no inverse, what can be done? Our problem is that there are multiple 
values of x that give the same y. The only way to get around the problem 
is to throw away all but one of these values of x. That is, we have to decide 
which one of our values of x we want to keep, and throw the rest away. As we 
saw in Section 1.1 above, this is called restricting the domain of our function. 
Effectively, we ghost out part of the curve so that what’s left no longer fails 
the horizontal line test. For example, if $g(x) = x^2$, we can ghost out the left
half of the graph like this:

The new (unghosted) curve has the reduced domain $[0, \infty)$ and satisfies the
horizontal line test, so there is an inverse function. More precisely, the function
h, which has domain $[0, \infty)$ and is defined by $h(x) = x^2$ on this domain, has
an inverse. Let’s play the reflection game to see what it looks like:



To find the equation of the inverse, we have to solve for x in the equation
$y = x^2$. Clearly the solution is $x = \sqrt{y}$ or $x = -\sqrt{y}$, but which one do we
need? We know that the range of the inverse function is the same as the
domain of the original function, which we have restricted to be $[0, \infty)$. So
we need a nonnegative number as our answer, and that has to be $x = \sqrt{y}$.
That is, $h^{-1}(y) = \sqrt{y}$. Of course, we could have ghosted out the right half of
the original graph to restrict the domain to $(-\infty, 0]$. In that case, we’d get a
function j which has domain $(-\infty, 0]$ and again satisfies $j(x) = x^2$, but only
on this domain. This function also has an inverse, but the inverse is now the
negative square root: $j^{-1}(y) = -\sqrt{y}$.

By the way, if you take the original function g given by $g(x) = x^2$ with
domain $(-\infty, \infty)$, which fails the horizontal line test, and try to reflect it in
the mirror y = x, you get the following picture:

Notice that the graph fails the vertical line test, so it’s not the graph of a
function. This illustrates the connection between the vertical and horizontal
line tests: when horizontal lines are reflected in the mirror y = x, they become
vertical lines.


1.2.4 Inverses of inverse functions






One more thing about inverse functions: if f has an inverse, it’s true that
$f^{-1}(f(x)) = x$ for all x in the domain of f, and also that $f(f^{-1}(y)) = y$ for
all y in the range of f. (Remember, the range of f is the same as the domain
of $f^{-1}$, so you can indeed take $f^{-1}(y)$ for y in the range of f without causing
any screwups.)

For example, if $f(x) = x^3$, then f has an inverse given by $f^{-1}(x) = \sqrt[3]{x}$,
and so $f^{-1}(f(x)) = \sqrt[3]{x^3} = x$ for any x. Remember, the inverse function is
like an undo button. We use x as an input to f, and then give the output to
$f^{-1}$; this undoes the transformation and gives us back x, the original number.
Similarly, $f(f^{-1}(y)) = (\sqrt[3]{y})^3 = y$. So $f^{-1}$ is the inverse function of f, and
f is the inverse function of $f^{-1}$. In other words, the inverse of the inverse is
the original function.

Now, you have to be careful in the case where you restrict the domain. Let
$g(x) = x^2$; we’ve seen that you need to restrict the domain to get an inverse.
Let’s say we restrict the domain to $[0, \infty)$ and carelessly continue to refer to
the function as g instead of h, as in the previous section. We would then say
that $g^{-1}(x) = \sqrt{x}$. If you calculate $g(g^{-1}(x))$, you find that this is $(\sqrt{x})^2$,
which equals x, provided that $x \ge 0$. (Otherwise you can’t take the square
root in the first place.)

On the other hand, if you work out $g^{-1}(g(x))$, you get $\sqrt{x^2}$, which is not
always the same thing as x. For example, if x = -2, then $x^2 = 4$, and so
$\sqrt{x^2} = \sqrt{4} = 2$. So it’s not true in general that $g^{-1}(g(x)) = x$. The problem
is that -2 isn’t in the restricted-domain version of g. Technically, you can’t
even compute g(-2), since -2 is no longer in the domain of g. We really
should be working with h, not g, so that we remember to be more careful.
Nevertheless, in practice, mathematicians will often restrict the domain without
changing letters! So it will be useful to summarize the situation as follows:

If the domain of a function f can be restricted so that f has an inverse
$f^{-1}$, then

• $f(f^{-1}(y)) = y$ for all y in the range of f; but

• $f^{-1}(f(x))$ may not equal x; in fact, $f^{-1}(f(x)) = x$ only when x is in
the restricted domain.


We’ll be revisiting these important points in the context of inverse trig functions
in Section 10.2.6 of Chapter 10.
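
If you like to see this sort of thing on a computer, here is a minimal Python sketch of the summary above, using the squaring function restricted to $[0, \infty)$. The names `g` and `g_inv` are just illustrative choices.

```python
import math

def g(x):
    """Squaring function; it only has an inverse on the restricted domain [0, inf)."""
    return x ** 2

def g_inv(y):
    """Inverse of g on the restricted domain: the nonnegative square root."""
    return math.sqrt(y)

# g(g_inv(y)) gives back y for every y in the range [0, inf).
for y in [0.0, 1.0, 4.0, 9.0]:
    assert math.isclose(g(g_inv(y)), y)

# g_inv(g(x)) gives back x only when x is in the restricted domain.
print(g_inv(g(3.0)))    # 3.0
print(g_inv(g(-2.0)))   # 2.0, not -2.0
```

The last line is the whole point: feeding in $-2$ returns 2, because $-2$ is outside the restricted domain.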


1.3 Composition of Functions 

Let’s say we have a function $g$ given by $g(x) = x^2$. You can replace $x$ by
anything you like, as long as it makes sense. For example, you can write
$g(y) = y^2$, or $g(x + 5) = (x + 5)^2$. This last example shows that you need to
be very careful with parentheses. It would be wrong to write $g(x + 5) = x + 5^2$,
since this is just $x + 25$, which is not the same thing as $(x + 5)^2$. If in doubt,
use parentheses. That is, if you need to write out $f(\text{something})$, replace every
instance of $x$ by (something), making sure to include the parentheses. Just
about the only time you don’t need to use parentheses is when the function is
an exponential function: for example, if $h(x) = 3^x$, then you can just write
$h(x^2 + 6) = 3^{x^2+6}$. You don’t need parentheses since you’re already writing
the $x^2 + 6$ as a superscript.

Now consider the function $f$ defined by $f(x) = \cos(x^2)$. If I give you a
number $x$, how do you compute $f(x)$? Well, first you square it, then you take
the cosine of the result. Since we can decompose the action of $f(x)$ into these
two separate actions which are performed one after the other, we might as
well describe those actions as functions themselves. So, let $g(x) = x^2$ and
$h(x) = \cos(x)$. To simulate what $f$ does when you use $x$ as an input, you
could first give $x$ to $g$ to square it, and then instead of taking the result back
you could ask $g$ to give its result to $h$ instead. Then $h$ spits out a number,
which is the final answer. The answer will, of course, be the cosine of what
came out of $g$, which was the square of the original $x$. This behavior exactly
mimics $f$, so we can write $f(x) = h(g(x))$. Another way of expressing this is
to write $f = h \circ g$; here the circle means “composed with.” That is, $f$ is $h$
composed with $g$, or in other words, $f$ is the composition of $h$ and $g$. What’s
tricky is that you write $h$ before $g$ (reading from left to right as usual!) but
you apply $g$ first. I agree that it’s confusing, but what can I say: you just
have to deal with it.
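
The "write $h$ first, apply $g$ first" business can be sketched in a few lines of Python. This is only an illustration; `compose` is a made-up helper name, not a standard library function.

```python
import math

def compose(outer, inner):
    """Return the function x -> outer(inner(x)), that is, outer composed with inner."""
    return lambda x: outer(inner(x))

def g(x):
    return x ** 2

h = math.cos

f = compose(h, g)   # written with h first, but g is applied first

x = 1.5
print(f(x) == math.cos(x ** 2))   # True
```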

It’s useful to practice composing two or more functions together. For
example, if $g(x) = 2^x$, $h(x) = 5x^4$, and $j(x) = 2x - 1$, what is a formula for
the function $f = g \circ h \circ j$? Well, just replace one thing at a time, starting
with $j$, then $h$, then $g$. So:

$$f(x) = g(h(j(x))) = g(h(2x - 1)) = g(5(2x - 1)^4) = 2^{5(2x - 1)^4}.$$
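
As a quick sanity check, here is a small Python sketch that builds $f$ one step at a time and compares it with the closed formula. The name `f_formula` is just an illustrative label for the final expression.

```python
def g(x):
    return 2 ** x

def h(x):
    return 5 * x ** 4

def j(x):
    return 2 * x - 1

def f(x):
    """f = g o h o j: apply j first, then h, then g."""
    return g(h(j(x)))

def f_formula(x):
    """The closed form 2^(5(2x - 1)^4)."""
    return 2 ** (5 * (2 * x - 1) ** 4)

for x in [0, 1, 2]:
    print(x, f(x) == f_formula(x))
```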

You should also practice reversing the process. For example, suppose you
start off with

$$f(x) = \frac{1}{\tan(5 \log_2(x + 3))}.$$

How would you decompose $f$ into simpler functions? Zoom in to where you
see the quantity $x$. The first thing you do is add 3, so let $g(x) = x + 3$.
Then you have to take the base 2 logarithm of the resulting quantity, so set
$h(x) = \log_2(x)$. Next, multiply by 5, so set $j(x) = 5x$. Then take the tangent,
so put $k(x) = \tan(x)$. Finally, take reciprocals, so let $m(x) = 1/x$. With all
these definitions, you should check that

$$f(x) = m(k(j(h(g(x))))).$$

Using the composition notation, you can write

$$f = m \circ k \circ j \circ h \circ g.$$

This isn’t the only way to break down $f$. For example, we could have combined
$h$ and $j$ into another function $n$, where $n(x) = 5 \log_2(x)$. Then you should
check that $n = j \circ h$, and

$$f = m \circ k \circ n \circ g.$$

Perhaps the original decomposition (involving $j$ and $h$) is better because it
breaks down $f$ into more elementary steps, but the second one (involving $n$)
isn’t wrong. After all, $n(x) = 5 \log_2(x)$ is still a pretty simple function of $x$.
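
If you want to check the two decompositions numerically, something like the following Python sketch would do. It tests a single input, $x = 1$, chosen so that $\log_2(x + 3)$ comes out to exactly 2; one test point doesn’t prove anything, of course.

```python
import math

def g(x): return x + 3              # add 3
def h(x): return math.log2(x)       # base 2 logarithm
def j(x): return 5 * x              # multiply by 5
def k(x): return math.tan(x)        # tangent
def m(x): return 1 / x              # reciprocal
def n(x): return 5 * math.log2(x)   # j and h combined into one step

def f(x):
    return 1 / math.tan(5 * math.log2(x + 3))

x = 1.0   # x + 3 = 4 and log2(4) = 2, so the tangent is taken of 10
step_by_step = m(k(j(h(g(x)))))
with_n = m(k(n(g(x))))
print(math.isclose(f(x), step_by_step), math.isclose(f(x), with_n))   # True True
```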

Beware: composition of functions isn’t the same thing as multiplying them
together. For example, if $f(x) = x^2 \sin(x)$, then $f$ is not the composition of
two functions. To calculate $f(x)$ for any given $x$, you actually have to find
both $x^2$ and $\sin(x)$ (it doesn’t matter which one you find first, unlike with
composition) and then multiply these two things together. If $g(x) = x^2$ and
$h(x) = \sin(x)$, then we’d write $f(x) = g(x)h(x)$, or $f = gh$. Compare this to
the composition of the two functions, $j = g \circ h$, which is given by

$$j(x) = g(h(x)) = g(\sin(x)) = (\sin(x))^2,$$

or simply $j(x) = \sin^2(x)$. The function $j$ is a completely different function
from the product $x^2 \sin(x)$. It’s also different from the function $k = h \circ g$,
which is also a composition of $g$ and $h$ but in the other order:

$$k(x) = h(g(x)) = h(x^2) = \sin(x^2).$$

This is yet another completely different function. The moral of the story is
that products and compositions are not the same thing, and furthermore, the
order of the functions matters when you compose them, but not when you
multiply them together.
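
Here is a small Python sketch of the moral, evaluated at the arbitrarily chosen input $x = 2$. The variable names are only for illustration.

```python
import math

def g(x):
    return x ** 2

def h(x):
    return math.sin(x)

x = 2.0
product = g(x) * h(x)   # f(x) = x^2 sin(x)
j = g(h(x))             # (g o h)(x) = sin^2(x)
k = h(g(x))             # (h o g)(x) = sin(x^2)

print(product, j, k)                # three different values
print(g(x) * h(x) == h(x) * g(x))   # True: order is irrelevant for products
```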

One simple but important example of composition of functions occurs
when you compose some function $f$ with $g(x) = x - a$, where $a$ is some
constant number. You end up with a new function $h$ given by $h(x) = f(x - a)$.
A useful point to note is that the graph of $y = h(x)$ is the same as the graph
of $y = f(x)$, except that it’s shifted over $a$ units to the right. If $a$ is negative,
then the shift is to the left. So, how would you sketch the graph of $y = (x - 1)^2$?
Take the graph of $y = x^2$ and shift it to the right by 1 unit.

[Figure: graph of $y = (x - 1)^2$.]

Similarly, the graph of $y = (x + 2)^2$ is the graph of $y = x^2$ shifted to the left
by 2 units, since you can interpret $(x + 2)$ as $(x - (-2))$.
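
The shifting rule can be sketched in Python as follows. The helper `shift_right` is a hypothetical name introduced just for this illustration.

```python
def shift_right(f, a):
    """Return h with h(x) = f(x - a); its graph is f's graph moved a units right."""
    return lambda x: f(x - a)

def square(x):
    return x ** 2

h = shift_right(square, 1)     # y = (x - 1)^2
k = shift_right(square, -2)    # y = (x + 2)^2, that is, shifted 2 units left

# The lowest point of y = x^2 is at x = 0; the shifted graphs bottom out at 1 and -2.
print(h(1), k(-2))             # 0 0
# Every point moves with the shift: h(x + 1) equals square(x).
print(all(h(x + 1) == square(x) for x in range(-5, 6)))   # True
```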


1.4 Odd and Even Functions



Some functions have some symmetry properties that make them easier to deal
with. Consider the function $f$ given by $f(x) = x^2$. Pick any positive number
you like (I’ll choose 3) and hit it with $f$ (I get 9). Now take the negative of
that number, $-3$ in my case, and hit that with $f$ (I get 9 again). You should
get the same answer both times, as I did, regardless of which number you
chose. You can express this phenomenon by writing $f(-x) = f(x)$ for all $x$.
That is, if you give $x$ to $f$ as an input, you get back the same answer as if
you used the input $-x$ instead. Notice that $g(x) = x^4$ and $h(x) = x^6$ also
have this property. In fact, $j(x) = x^n$, where $n$ is any even number ($n$ could
in fact be negative), has the same property. Inspired by this, we say that a
function $f$ is even if $f(-x) = f(x)$ for all $x$ in the domain of $f$. It’s not good
enough for this equation to be true for some values of $x$; it has to be true for
all $x$ in the domain of $f$.

Now, let’s say we play the same game with $f(x) = x^3$. Take your favorite
positive number (I’ll stick with 3) and hit that with $f$ (I get 27). Now try
again with the negative of your number, $-3$ in my case; I get $-27$, and you
should also get the negative of what you got before. You can express this
mathematically as $f(-x) = -f(x)$. Once again, the same property holds for
$j(x) = x^n$ when $n$ is any odd number (and once again, $n$ could be negative).
So, we say that a function $f$ is odd if $f(-x) = -f(x)$ for all $x$ in the domain
of $f$.

In general, a function might be odd, it might be even, or it might be
neither odd nor even. Don’t forget this last point! Most functions are neither
odd nor even. On the other hand, there’s only one function that’s both odd
and even, which is the rather boring function given by $f(x) = 0$ for all $x$ (we’ll
call this the “zero function”). Why is this the only odd and even function?
Let’s convince ourselves. If the function $f$ is even, then $f(-x) = f(x)$ for
all $x$. But if it’s also odd, then $f(-x) = -f(x)$ for all $x$. Take the first of
these equations and subtract the second from it. You should get $0 = 2f(x)$,
which means that $f(x) = 0$. This is true for all $x$, so the function $f$ must
just be the zero function. One other nice observation is that if a function
$f$ is odd, and the number 0 is in its domain, then $f(0) = 0$. Why is it so?
Because $f(-x) = -f(x)$ is true for all $x$ in the domain of $f$, so let’s try it for
$x = 0$. You get $f(-0) = -f(0)$. But $-0$ is the same thing as 0, so we have
$f(0) = -f(0)$. This simplifies to $2f(0) = 0$, or $f(0) = 0$ as claimed.

Anyway, starting with a function $f$, how can you tell if it is odd, even, or
neither? And so what if it is odd or even anyway? Let’s look at this second
question before coming back to the first one. One nice thing about knowing
that a function is odd or even is that it’s easier to graph the function. In fact,
if you can graph the right-hand half of the function, the left-hand half is a
piece of cake! Let’s say that $f$ is an even function. Then since $f(x) = f(-x)$,
the graph of $y = f(x)$ is at the same height above the $x$-coordinates $x$ and
$-x$. This is true for all $x$, so the situation looks something like this:

We can conclude that the graph of an even function has mirror symmetry
about the $y$-axis. So, if you graph the right half of a function which
you know is even, you can get the left half by reflecting the right half about
the $y$-axis. Check the graph of $y = x^2$ to make sure that it has this mirror
symmetry.


On the other hand, let’s say that $f$ is an odd function. Since we have
$f(-x) = -f(x)$, the graph of $y = f(x)$ is at the same height above the
$x$-coordinate $x$ as it is below the $x$-coordinate $-x$. (Of course, if $f(x)$ is
negative, then you have to switch the words “above” and “below.”) In any
case, the picture looks like this:




The symmetry is now a point symmetry about the origin. That is, the graph 
of an odd function has 180° point symmetry about the origin. This 
means that if you only have the right half of a function which you know is 
odd, you can get the left half as follows. Pretend that the curve is sitting 
on top of the paper, so you can pick it up if you like but you can’t change 
its shape. Instead of picking it up, put a pin through the curve at the origin 
(remember, odd functions must pass through the origin if they are defined at 
0) and then spin the whole curve around half a revolution. This is what the 
left-hand half of the graph looks like. (This doesn’t work so well if the curve 
isn’t continuous, that is, if the curve isn’t all in one piece!) Check to see that 
the above graph and also the graph of $y = x^3$ have this symmetry.

Now, suppose $f$ is defined by the equation $f(x) = \log_5(2x^6 - 6x^2 + 3)$. How
do you tell if $f$ is odd, even, or neither? The technique is to calculate $f(-x)$
by replacing every instance of $x$ with $(-x)$, making sure not to forget the
parentheses around $-x$, and then simplifying the result. If you end up with
the original expression $f(x)$, then $f$ is even; if you end up with the negative of
the original expression, $-f(x)$, then $f$ is odd; if you end up with a mess that
isn’t either $f(x)$ or $-f(x)$, then $f$ is neither (or you didn’t simplify enough!).
In the example above, you’d write

$$f(-x) = \log_5(2(-x)^6 - 6(-x)^2 + 3) = \log_5(2x^6 - 6x^2 + 3),$$


which is actually equal to the original $f(x)$. So the function $f$ is even. How
about

$$g(x) = \frac{2x^3 + x}{3x^2 + 5} \quad\text{and}\quad h(x) = \frac{2x^3 + x - 1}{3x^2 + 5}?$$

Well, for $g$, we have

$$g(-x) = \frac{2(-x)^3 + (-x)}{3(-x)^2 + 5} = \frac{-2x^3 - x}{3x^2 + 5}.$$

Now you have to observe that you can take the minus sign out front and write

$$g(-x) = -\frac{2x^3 + x}{3x^2 + 5},$$


which, you notice, is equal to $-g(x)$. That is, apart from the minus sign, we
get the original function back. So, $g$ is an odd function. How about $h$? We
have

$$h(-x) = \frac{2(-x)^3 + (-x) - 1}{3(-x)^2 + 5} = \frac{-2x^3 - x - 1}{3x^2 + 5}.$$

Once again, we take out the minus sign to get

$$h(-x) = -\frac{2x^3 + x + 1}{3x^2 + 5}.$$

Hmm, this doesn’t appear to be the negative of the original function, because
of the $+1$ term in the numerator. It’s not the original function either, so the
function $h$ is neither odd nor even.
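
A numerical spot check can back up these three conclusions. The sketch below is an assumption-laden illustration: `classify_parity` is a made-up helper, and the sample points are my own choice (picked so that the logarithm in $f$ is defined). Checking a handful of points can show that a function is not odd or not even, but it can’t prove that it is; the algebra above does that.

```python
import math

def classify_parity(func, samples, tol=1e-9):
    """Compare f(-x) with f(x) and with -f(x) at the given sample points."""
    even = all(math.isclose(func(-x), func(x), abs_tol=tol) for x in samples)
    odd = all(math.isclose(func(-x), -func(x), abs_tol=tol) for x in samples)
    if even and odd:
        return "both"
    if even:
        return "even"
    if odd:
        return "odd"
    return "neither"

def f(x):
    return math.log(2 * x**6 - 6 * x**2 + 3, 5)

def g(x):
    return (2 * x**3 + x) / (3 * x**2 + 5)

def h(x):
    return (2 * x**3 + x - 1) / (3 * x**2 + 5)

# Sample points where 2x^6 - 6x^2 + 3 is positive, so the log is defined.
samples = [0.1, 0.5, 2.0, 3.0]
print(classify_parity(f, samples))   # even
print(classify_parity(g, samples))   # odd
print(classify_parity(h, samples))   # neither
```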

Let’s look at one more example. Suppose you want to prove that the
product of two odd functions is always an even function. How would you go
about doing this? Well, it helps to have names for things, so let’s say we have
two odd functions $f$ and $g$. We need to look at the product of these functions,
so let’s call the product $h$. That is, we define $h(x) = f(x)g(x)$. So, our task is
to show that $h$ is even. We’ll do this by showing that $h(-x) = h(x)$, as usual.
It will be helpful to note that $f(-x) = -f(x)$ and $g(-x) = -g(x)$, since $f$
and $g$ are odd. Let’s start with $h(-x)$. Since $h$ is the product of $f$ and $g$, we
have $h(-x) = f(-x)g(-x)$. Now we use the oddness of $f$ and $g$ to express
this last term as $(-f(x))(-g(x))$. The minus signs now come out front and
cancel out, so this is the same thing as $f(x)g(x)$, which of course equals $h(x)$.
We could (and should) express all this text mathematically like this:

$$h(-x) = f(-x)g(-x) = (-f(x))(-g(x)) = f(x)g(x) = h(x).$$

Anyway, since $h(-x) = h(x)$, the function $h$ is even. Now you should try to
prove that the product of two even functions is always even, and also that the
product of an odd and an even function must be odd. Go on, do it now!
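
Once you’ve had a go at the proofs, you might like to see the three product rules hold up on some concrete functions. Here is a rough Python sketch; the sample functions and sample points are arbitrary choices of mine, and a numerical check like this is no substitute for the proof.

```python
import math

def is_even(func, samples):
    return all(math.isclose(func(-x), func(x)) for x in samples)

def is_odd(func, samples):
    return all(math.isclose(func(-x), -func(x)) for x in samples)

def product(f, g):
    """Return the pointwise product h(x) = f(x) g(x)."""
    return lambda x: f(x) * g(x)

def cube(x):
    return x ** 3      # odd

def square(x):
    return x ** 2      # even

samples = [0.3, 1.0, 2.5]
print(is_even(product(cube, math.sin), samples))     # True: odd times odd is even
print(is_even(product(square, math.cos), samples))   # True: even times even is even
print(is_odd(product(cube, math.cos), samples))      # True: odd times even is odd
```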

1.5 Graphs of Linear Functions

Functions of the form $f(x) = mx + b$ are called linear. There’s a good reason
for this: the graphs of these functions are lines. (As far as we’re concerned,
the word “line” always means “straight line.”) The slope of the line is given
by $m$. Imagine for a moment that you are in the page, climbing the line as
if it were a mountain. You start at the left side of the page and head to the
right, like this:





If the slope m is positive, as it is in the above picture, then you are heading 
uphill. The bigger m is, the steeper the climb. On the other hand, if the 
slope is negative, then you are heading downhill. The more negative the 
slope, the steeper the downhill grade. If the slope is zero, then the line is flat, 
or horizontal ― you’re going neither uphill nor downhill, just trudging along a 
flat line. 

To sketch the graph of a linear function, you only need to identify two
points on the graph. This is because there’s only one line that goes through
two different points. You just put your ruler on the points and draw the line.
One point is easy to find, namely, the $y$-intercept. Set $x = 0$ in the equation
$y = mx + b$, and you see that $y = m \times 0 + b = b$. That is, the $y$-intercept is
equal to $b$, so the line goes through $(0, b)$. To find another point, you could
find the $x$-intercept by setting $y = 0$ and finding what $x$ is. This works pretty
well except in two cases. The first case is when $b = 0$, in which case we are
just dealing with $y = mx$. This goes through the origin, so the $x$-intercept
and the $y$-intercept are both zero. To get another point, you’ll just have to
substitute in $x = 1$ and see that $y = m$. So, the line $y = mx$ goes through
the origin and $(1, m)$. For example, the line $y = -2x$ goes through the origin
and also through $(1, -2)$, so it looks like this:








The other bad case is when m = 0. But then we just have y = b, which is a 
horizontal line through (0, b). 

For a more interesting example, consider $y = \frac{1}{2}x - 1$. The $y$-intercept is
$-1$, and the slope is $\frac{1}{2}$. To sketch the line, find the $x$-intercept by setting
$y = 0$. We get $0 = \frac{1}{2}x - 1$, which simplifies to $x = 2$. So, the line looks like
this:



Now, let’s suppose you know that you have a line in the plane, but you don’t
know its equation. If you know it goes through a certain point, and you know
what its slope is, then you can find the equation of the line. You really, really,
really need to know how to do this, since it comes up a lot. This formula,
called the point-slope form of a linear function, is what you need to know:

If a line goes through $(x_0, y_0)$ and has slope $m$,
then its equation is $y - y_0 = m(x - x_0)$.

For example, what is the equation of the line through $(-2, 5)$ which has slope
$-3$? It is $y - 5 = -3(x - (-2))$, which you can expand and simplify down to
$y = -3x - 1$.

Sometimes you don’t know the slope of the line, but you do know two
points that it goes through. How do you find the equation? The technique
is to find the slope, then use the previous idea with one of the points (your
choice) to find the equation. First, you need to know this:

If a line goes through $(x_1, y_1)$ and $(x_2, y_2)$, its slope is equal to $\dfrac{y_2 - y_1}{x_2 - x_1}$.

So, what is the equation of the line through $(-3, 4)$ and $(2, -6)$? Let’s find
the slope first:

$$\text{slope} = \frac{-6 - 4}{2 - (-3)} = \frac{-10}{5} = -2.$$

We now know that the line goes through $(-3, 4)$ and has slope $-2$, so its
equation is $y - 4 = -2(x - (-3))$, or after simplifying, $y = -2x - 2$. Alternatively,
we could have used the other point $(2, -6)$ with slope $-2$ to see that the
equation of the line is $y - (-6) = -2(x - 2)$, which simplifies to $y = -2x - 2$.
Thankfully this is the same equation as before. It doesn’t matter which point
you pick, as long as you have used both points to find the slope.
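
The two boxed facts translate directly into a short Python sketch. The function names `slope` and `point_slope` are illustrative, and exact fractions are used so that no rounding gets in the way.

```python
from fractions import Fraction

def slope(p1, p2):
    """Slope of the line through two points with different x-coordinates."""
    (x1, y1), (x2, y2) = p1, p2
    return Fraction(y2 - y1, x2 - x1)

def point_slope(point, m):
    """Return (m, b) for y = m x + b, starting from y - y0 = m (x - x0)."""
    x0, y0 = point
    return m, y0 - m * x0

p1, p2 = (-3, 4), (2, -6)
m = slope(p1, p2)
print(point_slope(p1, m))   # slope -2, intercept -2
print(point_slope(p2, m))   # the same line, whichever point is used
print(point_slope((-2, 5), Fraction(-3)))   # slope -3, intercept -1
```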

1.6 Common Functions and Graphs

Here are the most important functions you should know about. 

1. Polynomials: these are functions built out of nonnegative integer powers
of $x$. You start with the building blocks $1$, $x$, $x^2$, $x^3$, and so on, and you are
allowed to multiply these basic functions by numbers and add a finite number
of them together. For example, the polynomial $f(x) = 5x^4 - 4x^3 + 10$ is formed
by taking 5 times the building block $x^4$, and $-4$ times the building block $x^3$,
and 10 times the building block 1, and adding them together. You might
also want to include the intermediate building blocks $x^2$ and $x$, but since they
don’t appear, you need to take 0 times each of them. The amount that you multiply
the building block $x^n$ by is called the coefficient of $x^n$. For example, in the
polynomial $f$ above, the coefficient of $x^4$ is 5, the coefficient of $x^3$ is $-4$, the
coefficients of $x^2$ and $x$ are both 0, and the coefficient of 1 is 10. (Why allow
$x$ and 1, by the way? They seem different from the other blocks, but they’re
not really: $x = x^1$ and $1 = x^0$.) The highest number $n$ such that $x^n$ has a
nonzero coefficient is called the degree of the polynomial. For example, the
degree of the above polynomial $f$ is 4, since no power of $x$ greater than 4 is
present. The mathematical way to write a general polynomial of degree $n$ is

$$p(x) = a_n x^n + a_{n-1} x^{n-1} + \cdots + a_2 x^2 + a_1 x + a_0,$$

where $a_n$ is the coefficient of $x^n$, $a_{n-1}$ is the coefficient of $x^{n-1}$, and so on
down to $a_0$, which is the coefficient of 1.
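
One way to make the building-block idea concrete is to store a polynomial as its list of coefficients. The following Python sketch assumes the list runs from $a_0$ up to $a_n$; that ordering, and the names `evaluate` and `degree`, are my own conventions for the illustration.

```python
def evaluate(coeffs, x):
    """Evaluate a_n x^n + ... + a_1 x + a_0, with coeffs listed as [a_0, a_1, ..., a_n]."""
    return sum(a * x ** k for k, a in enumerate(coeffs))

def degree(coeffs):
    """Highest power with a nonzero coefficient (None for the zero polynomial)."""
    nonzero = [k for k, a in enumerate(coeffs) if a != 0]
    return max(nonzero) if nonzero else None

# f(x) = 5x^4 - 4x^3 + 10; the coefficients of x^2 and x are both 0.
f_coeffs = [10, 0, 0, -4, 5]
print(degree(f_coeffs))        # 4
print(evaluate(f_coeffs, 2))   # 5*16 - 4*8 + 10 = 58
```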

Since the functions $x^n$ are the building blocks of all polynomials, you
should know what their graphs look like. The even powers mostly look similar
to each other, and the same can be said for the odd powers. Here’s what the
graphs look like, from $x^0$ up to $x^7$:

Sketching the graphs of more general polynomials is more difficult. Even finding
the $x$-intercepts is often impossible unless the polynomial is very simple.
There is one aspect of the graph that is fairly straightforward, which is what
happens at the far left and right sides of the graph. This is determined by
the so-called leading coefficient, which is the coefficient of the highest-degree
term. This is basically the number $a_n$ defined above. For example, in our
polynomial $f(x) = 5x^4 - 4x^3 + 10$ from above, the leading coefficient is 5. In
fact, it only matters whether the leading coefficient is positive or negative. It
also matters whether the degree of the polynomial is odd or even; so there are
four possibilities for what the edges of the graph can look like:



[Figure: four panels showing the edges of the graph, labeled “$n$ even, $a_n > 0$”, “$n$ odd, $a_n > 0$”, “$n$ even, $a_n < 0$”, and “$n$ odd, $a_n < 0$”.]


The wiggles in the center of these diagrams aren’t relevant, since they depend
on the other terms of the polynomial. The diagram is just supposed to show
what the graphs look like near the left and right edges. In this sense, the
graph of our polynomial $f(x) = 5x^4 - 4x^3 + 10$ looks like the leftmost picture
above, since $n = 4$ is even and $a_n = 5$ is positive.
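
The four possibilities can be summarized in a few lines of Python. This sketch relies on the standard fact that for large $|x|$ the leading term $a_n x^n$ dominates; the function name `end_behavior` and the words "up" and "down" are just my labels for where the graph heads.

```python
def end_behavior(degree, leading_coeff):
    """Describe where a polynomial of degree >= 1 heads at the far left and far right."""
    if degree < 1 or leading_coeff == 0:
        raise ValueError("need degree >= 1 and a nonzero leading coefficient")
    right = "up" if leading_coeff > 0 else "down"
    # For even degree both ends agree; for odd degree they are opposite.
    if degree % 2 == 0:
        left = right
    else:
        left = "down" if right == "up" else "up"
    return left, right

print(end_behavior(4, 5))    # ('up', 'up') for 5x^4 - 4x^3 + 10
print(end_behavior(3, 2))    # ('down', 'up')
print(end_behavior(3, -2))   # ('up', 'down')
```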

Let’s spend a little time on degree 2 polynomials, which are called quadratics.
Instead of writing $p(x) = a_2 x^2 + a_1 x + a_0$, it’s easier to write the coefficients
as $a$, $b$, and $c$, so we have $p(x) = ax^2 + bx + c$. Quadratics have two, one,
or zero (real) roots, depending on the sign of the discriminant. The discriminant,
which is often written as $\Delta$, is given by $\Delta = b^2 - 4ac$. There are three
possibilities. If $\Delta > 0$, then there are two roots; if $\Delta = 0$, there is one root,
which is called a double root; and if $\Delta < 0$, then there are no roots. In the
first two cases, the roots are given by

$$\frac{-b \pm \sqrt{b^2 - 4ac}}{2a}.$$



Notice that the expression in the square root is just the discriminant. An important
technique for dealing with quadratics is completing the square. Here’s
how it works. We’ll use the example of the quadratic $2x^2 - 3x + 10$. The
first step is to take out the leading coefficient as a factor. So our quadratic
becomes $2(x^2 - \frac{3}{2}x + 5)$. This reduces the situation to dealing with a monic
quadratic, which is a quadratic with leading coefficient equal to 1. So, let’s
worry about $x^2 - \frac{3}{2}x + 5$. The main technique now is to take the coefficient
of $x$, which in our example is $-\frac{3}{2}$, divide it by 2 to get $-\frac{3}{4}$, and square it. We
get $\frac{9}{16}$. We wish that the constant term were $\frac{9}{16}$ instead of 5, so let’s do some
mental gymnastics:

$$x^2 - \frac{3}{2}x + 5 = x^2 - \frac{3}{2}x + \frac{9}{16} + 5 - \frac{9}{16}.$$

Why on earth would we want to add and subtract $\frac{9}{16}$? Because the first three
terms combine to form $(x - \frac{3}{4})^2$. So, we have

$$x^2 - \frac{3}{2}x + 5 = \left(x^2 - \frac{3}{2}x + \frac{9}{16}\right) + 5 - \frac{9}{16} = \left(x - \frac{3}{4}\right)^2 + 5 - \frac{9}{16}.$$

Now we just have to work out the last little bit, which is just arithmetic:
$5 - \frac{9}{16} = \frac{71}{16}$. Putting it all together, and restoring the factor of 2, we have

$$2x^2 - 3x + 10 = 2\left(x^2 - \frac{3}{2}x + 5\right) = 2\left(\left(x - \frac{3}{4}\right)^2 + \frac{71}{16}\right) = 2\left(x - \frac{3}{4}\right)^2 + \frac{71}{8}.$$

It turns out that this is a much nicer form to deal with in a number of situations.
Make sure you know how to complete the square, since we’ll be using
this technique a lot in Chapters 18 and 19.
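
Both the discriminant test and completing the square are mechanical enough to sketch in Python. Treat this as an illustration under my own conventions: `real_roots` and `complete_the_square` are made-up names, and the completed square is returned as a triple $(a, p, q)$ meaning $a(x + p)^2 + q$.

```python
from fractions import Fraction
import math

def real_roots(a, b, c):
    """Real roots of a x^2 + b x + c (a != 0), chosen by the sign of the discriminant."""
    disc = b * b - 4 * a * c
    if disc < 0:
        return []                      # no real roots
    if disc == 0:
        return [-b / (2 * a)]          # one double root
    root = math.sqrt(disc)
    return [(-b - root) / (2 * a), (-b + root) / (2 * a)]

def complete_the_square(a, b, c):
    """Return (a, p, q) with a x^2 + b x + c = a (x + p)^2 + q, using exact fractions."""
    a, b, c = Fraction(a), Fraction(b), Fraction(c)
    p = b / (2 * a)          # half the x-coefficient of the monic quadratic
    q = c - a * p * p        # what is left over after forming the square
    return a, p, q

print(complete_the_square(2, -3, 10))   # a = 2, p = -3/4, q = 71/8
print(real_roots(2, -3, 10))            # [] since the discriminant is 9 - 80 < 0
print(real_roots(1, -3, 2))             # [1.0, 2.0]
```

For $2x^2 - 3x + 10$ this reproduces $2(x - \frac{3}{4})^2 + \frac{71}{8}$, matching the hand computation.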


2. Rational functions: these are functions of the form

$$\frac{p(x)}{q(x)},$$

where $p$ and $q$ are polynomials. Rational functions will pop up in many
different contexts, and the graphs can look very different depending on the
polynomials $p$ and $q$. The simplest examples of rational functions are polynomials
themselves, which arise when $q(x)$ is the constant polynomial 1. The
next simplest examples are the functions $1/x^n$, where $n$ is a positive integer.
Let’s look at some of the graphs of these functions:



The odd powers look similar to each other, and the even powers look
similar to each other. It’s worth knowing what these graphs look like.

The domain is $(0, \infty)$; note that this backs up what I said earlier about not
being able to take logarithms of a negative number or of 0. The range is all of
$(-\infty, \infty)$, and there’s a vertical asymptote at $x = 0$. The graphs of $\log_{10}(x)$,
and indeed $\log_b(x)$ for any $b > 1$, are very similar to this one. The log function
is very important in calculus, so you should really know how to draw the
above graph. We’ll look at other properties of logarithms in Chapter 9.

4. Trig functions: these are so important that the entire next chapter is 
devoted to them. 


5. Functions involving absolute values: let’s take a close look at the
absolute value function $f$ given by $f(x) = |x|$. Here’s the definition of $|x|$:

$$|x| = \begin{cases} x & \text{if } x \ge 0, \\ -x & \text{if } x < 0. \end{cases}$$

Another way of looking at $|x|$ is that it is the distance between $x$ and 0 on
the number line. More generally, you should learn this nice fact:

$|x - y|$ is the distance between $x$ and $y$ on the number line.

For example, suppose that you need to identify the region $|x - 1| \le 3$ on the
number line. You can interpret the inequality as “the distance between $x$ and
1 is less than or equal to 3.” That is, we are looking for all the points that
are no more than 3 units away from the number 1. So, let’s take a number
line and mark in the number 1 as follows:


The points which are no more than 3 units away extend to $-2$ on the left and
4 on the right, so the region we want looks like this:

[Figure: number line with $-2$, 1, and 4 marked; the region extends 3 units to each side of 1.]

So, the region $|x - 1| \le 3$ can also be described as $[-2, 4]$.

It’s also true that $|x| = \sqrt{x^2}$. To check this, suppose that $x \ge 0$; then
$\sqrt{x^2} = x$, no problem. If instead $x < 0$, then it can’t be true that $\sqrt{x^2} = x$,
since the left-hand side is positive but the right-hand side is negative. The
correct equation is $\sqrt{x^2} = -x$; now the right-hand side is positive, since it’s
minus a negative number. If you look back at the definition of $|x|$, you’ll see
that we have just proved that $|x| = \sqrt{x^2}$. Even so, to deal with $|x|$, it’s much
better to use the piecewise definition than to write it as $\sqrt{x^2}$.
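
Both facts about absolute values are easy to try out in Python. In this sketch, `region` and `in_region` are hypothetical helper names, and the test values are arbitrary.

```python
import math

def region(center, radius):
    """Closed interval of all x with |x - center| <= radius (radius >= 0)."""
    return (center - radius, center + radius)

def in_region(x, center, radius):
    """True when the distance between x and center is at most radius."""
    return abs(x - center) <= radius

print(region(1, 3))            # (-2, 4)
print(in_region(-2, 1, 3))     # True: the endpoints are included
print(in_region(4.5, 1, 3))    # False: 4.5 is 3.5 units away from 1

# |x| agrees with sqrt(x^2) for negative and nonnegative x alike.
print(all(math.sqrt(x * x) == abs(x) for x in [-3.0, -0.5, 0.0, 2.0]))   # True
```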


Finally, let’s take a look at some graphs. If you know what the graph of a
function looks like, you can get the graph of the absolute value of that function
by reflecting everything below the $x$-axis up to above the $x$-axis, using the
$x$-axis as your mirror. For example, here’s the graph of $y = |x|$, which comes
from reflecting the bottom portion of $y = x$ in the $x$-axis:

[Figure: graph of $y = |x|$, with the $x$-axis as the mirror.]

How about the graph of $y = |\log_2(x)|$? Using the reflection of the graph of
$y = \log_2(x)$ above, this is what the absolute value version looks like:

[Figure: graph of $y = |\log_2(x)|$, with the $x$-axis as the mirror.]

Anyway, that’s all I have to say about functions, apart from trig functions 
which are the subject of the next chapter. Hopefully you’ve seen a lot of the 
stuff in this chapter before. Most of the material in this chapter is used over 
and over again in calculus, so make sure you really get on top of it all as soon 
as you can! 











CHAPTER 2 


Review of Trigonometry 


To do calculus, you really need to know trigonometry. Truth be told, we won’t 
see much trig at first, but when it comes, it doesn’t let up. So we might as 
well do a thorough review of the most important aspects of trig: 

• angles in radians and the basics of the trig functions; 

• trig functions on the real line (not just angles between 0° and 90°); 

• graphs of trig functions; and 

• trig identities. 

Time to refresh your memory ...

2.1 Basics 

The first thing I want to remind you about is the notion of radians. Instead of
saying that there are 360 degrees in a full revolution, we’ll say that there are $2\pi$
radians. This may seem a bit wacky, but there is a reason: the circumference
of a circle of radius 1 unit is $2\pi$ units. In fact, the arc length of a wedge of
this circle is exactly the angle of the wedge:



This picture is pretty and all, but the main thing is to be comfortable with
the most common angles in both degree and radian form. First, you should
become absolutely comfortable with the idea that 90° is the same as $\pi/2$
radians, and similarly that 180° is the same as $\pi$ radians and 270° is the same
as $3\pi/2$ radians. Once you have that in mind, try to be comfortable converting
all the angles in the following picture back and forth between degrees and
radians:

[Figure: common angles labeled in both degrees and radians, for example 90° = $\pi/2$.]



More generally, you can also use the formula

$$\text{angle in radians} = \frac{\pi}{180} \times \text{angle in degrees}$$

if you need to. For example, to see what $5\pi/12$ radians is in degrees, solve

$$\frac{5\pi}{12} = \frac{\pi}{180} \times \text{angle in degrees}$$

to see that $5\pi/12$ radians is the same as $(180/\pi) \times (5\pi/12) = 75^\circ$. In fact,
you can think of this conversion from radians to degrees as a sort of change
of units, like changing from miles to kilometers. The conversion factor is that
$\pi$ radians is the same as 180 degrees.
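
Here is the conversion as a short Python sketch. The two hand-written functions are illustrative names of mine; `math.degrees` and `math.radians` are the standard library’s own versions of the same conversion.

```python
import math

def to_degrees(radians):
    """Convert using the fact that pi radians is the same as 180 degrees."""
    return radians * 180 / math.pi

def to_radians(degrees):
    return degrees * math.pi / 180

print(to_degrees(5 * math.pi / 12))                  # approximately 75
print(math.isclose(to_radians(90), math.pi / 2))     # True
# The standard library has the same conversions built in.
print(math.isclose(math.degrees(math.pi), 180.0))    # True
```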

We have only looked at angles so far, so let’s move on to trig functions.
Obviously you have to know how the trig functions are defined in terms of
triangles. Suppose you have a right-angled triangle and one of the angles,
other than the right angle, is labeled $\theta$, like this:

[Figure: right-angled triangle with the angle $\theta$ marked.]

Then the basic formulas are

$$\sin(\theta) = \frac{\text{opposite}}{\text{hypotenuse}}, \quad \cos(\theta) = \frac{\text{adjacent}}{\text{hypotenuse}}, \quad\text{and}\quad \tan(\theta) = \frac{\text{opposite}}{\text{adjacent}}.$$

Of course, if the angle $\theta$ is moved, then the opposite and adjacent have to be
moved as well:

The opposite is, unsurprisingly, opposite the angle $\theta$ and the adjacent is next
to it. The hypotenuse never changes, though: it is the longest side and is
always across from the right angle.

We’ll also be using the reciprocal functions csc, sec, and cot, which are
defined as follows:

$$\csc(x) = \frac{1}{\sin(x)}, \quad \sec(x) = \frac{1}{\cos(x)}, \quad\text{and}\quad \cot(x) = \frac{1}{\tan(x)}.$$

Now, a piece of advice if you ever plan to take a calculus exam (or even
if you don’t!): learn the values of the trig functions at the common angles
$0$, $\pi/6$, $\pi/4$, $\pi/3$, and $\pi/2$. For example, without thinking, can you simplify
$\sin(\pi/3)$? How about $\tan(\pi/4)$? If you can’t, then at best you’re wasting
time trying to use a triangle to find the answer, and at worst you’re throwing
away easy points by not simplifying your answer all the way. The solution is
to memorize the following table:

          0       π/6      π/4      π/3      π/2
   sin    0       1/2      1/√2     √3/2     1
   cos    1       √3/2     1/√2     1/2      0
   tan    0       1/√3     1        √3       *

The star means that $\tan(\pi/2)$ is undefined. In fact, the tan function has a
vertical asymptote at $\pi/2$ (this will be clear from the graph, which we’ll look
at in Section 2.3 below). Anyway, you need to be able to quote any of the
entries in this table, both forward and backward! What this means is that
you have to be able to answer two types of questions. Here are examples of
each of these types:

1. What is $\sin(\pi/3)$? (Using the table, the answer is $\sqrt{3}/2$.)

2. What angle between 0 and $\pi/2$ has a sine equal to $\sqrt{3}/2$? (The answer
is obviously $\pi/3$.)

Of course, you have to be able to answer these two types of questions for each
entry in the table. Please, please, I beg of you, learn this table! Math isn’t
about memorization, but there are a few things that are worth memorizing
and this table is definitely on the list. So make flash cards, get your friends
to quiz you, spend one minute a day, whatever works for you, but learn the
table.
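
If you ever doubt an entry, you can check it against a calculator. The Python sketch below compares the standard exact values with what the `math` module computes; the dictionary layout is just my own way of writing the table down, and $\tan(\pi/2)$ is left out because it is undefined.

```python
import math

# Exact table entries written as floating-point expressions.
table = {
    "sin": [0, 1 / 2, 1 / math.sqrt(2), math.sqrt(3) / 2, 1],
    "cos": [1, math.sqrt(3) / 2, 1 / math.sqrt(2), 1 / 2, 0],
    "tan": [0, 1 / math.sqrt(3), 1, math.sqrt(3)],
}
angles = [0, math.pi / 6, math.pi / 4, math.pi / 3, math.pi / 2]
functions = {"sin": math.sin, "cos": math.cos, "tan": math.tan}

for name, expected in table.items():
    for angle, value in zip(angles, expected):
        # abs_tol is needed because cos(pi/2) evaluates to about 6e-17, not exactly 0.
        assert math.isclose(functions[name](angle), value, abs_tol=1e-12)

# Reading the table backward: which angle in [0, pi/2] has sine sqrt(3)/2?
print(math.isclose(math.asin(math.sqrt(3) / 2), math.pi / 3))   # True
```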

2.2 Extending the Domain of Trig Functions

The above table (did you learn it yet?) only covers some angles ranging from
0 to $\pi/2$. It’s possible to take sin or cos of any angle at all, even a negative
one. For tan, we have to be a little more careful. For example, we saw above
that $\tan(\pi/2)$ is undefined. Still, we’ll be able to take tan of just about every
angle, even most negative ones.

Let’s first look at angles between 0 and $2\pi$ (remember that $2\pi$ is the same
as 360°). Suppose you want to calculate $\sin(\theta)$ (or $\cos(\theta)$, or $\tan(\theta)$), where
$\theta$ is between 0 and $2\pi$. To see what this even means, start by drawing a
coordinate plane with some slightly weird labels:


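[Figure: coordinate plane with quadrants I to IV; the axes are labeled $0 = 2\pi$, $\pi/2$, $\pi$, and $3\pi/2$.]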


Notice that the axes divide the plane into four quadrants, which are creatively
labeled from 1 to 4 (in Roman numerals), and that the labeling goes counterclockwise.
These quadrants are called the first, second, third, and fourth
quadrants, respectively. The next step is to draw a ray (that’s half a line)
starting at the origin. Which ray? It depends on $\theta$. Just imagine yourself
standing at the origin, looking to the right along the positive $x$-axis. Now
turn counterclockwise an angle of $\theta$, then march forward in a straight line.
Your trail is the ray you’re looking for.

Now the other labels on the above picture (and the one on page 26) make
a lot of sense. Indeed, if you turn an angle of $\pi/2$, you are facing up the page
and you trace out the positive $y$-axis as you walk along. If you had instead
turned an angle of $\pi$, you’d get the negative $x$-axis; and if you had turned
$3\pi/2$, you’d get the negative $y$-axis. Finally, if you had turned $2\pi$, that would
put you back to where you started, facing along the positive $x$-axis. It’s the
same as if you hadn’t turned at all! That’s why the picture says $0 = 2\pi$. As
far as angles are concerned, 0 and $2\pi$ are equivalent.

OK, let’s take some angle θ and draw in the appropriate ray. Perhaps it
might be somewhere in the third quadrant, like this:



Notice that we label the ray as θ, not the angle itself. Anyway, now we pick
some point on the ray and drop a perpendicular from that point to the x-axis:



We’re interested in three quantities: the x- and y-coordinates of the point
(which are called x and y, of course!) and also the distance from the point
to the origin, which is called r. Note that x and y could both potentially
be negative (in fact, they both are negative in the above picture), but r is
always positive, since it’s a distance. In fact, by Pythagoras’ Theorem, we
have r = √(x² + y²), regardless of the signs of x and y. (The squares kill off
any minus signs around.)

Armed with these three quantities, we can define the three trig functions
as follows:

$$\sin(\theta) = \frac{y}{r}, \qquad \cos(\theta) = \frac{x}{r}, \qquad \tan(\theta) = \frac{y}{x}.$$

These are just the regular formulas from Section 2.1 above, with the quantities 
x, y, and r interpreted as the adjacent, opposite, and hypotenuse, respectively. 
But wait, you say ― what happens if you choose a different point on the ray? 
It doesn’t matter, because your new triangle will be similar to the old one 
and the above ratios are unaffected. In fact, it is often convenient to assume 



So the angle 7π/6 is in the third quadrant. We’ve chosen the point on the
ray which has distance r = 1 from the origin, then dropped a perpendicular.
We know from the above formulas that sin(θ) = y/r = y (since r = 1), so we
really need to find y. Well, that little angle between our ray at 7π/6 and the
negative x-axis (which itself is at π) must be the difference between these
two angles, π/6. The little angle is called the reference angle. In general, the
reference angle for θ is the smallest angle between the ray which represents θ
and the x-axis. It must be between 0 and π/2. In our example, the closest
route to the x-axis is up, so the reference angle looks like this:

So in the little triangle, we know that r = 1 and the angle is π/6. It looks
like y = sin(π/6) = 1/2, except that can’t be right! Since we’re below the
x-axis, the quantity y must be negative. That is, y = −1/2. Since sin(θ) = y,
we have shown that sin(7π/6) = −1/2. We can also repeat this with cosine
instead of sine to see that x = −cos(π/6) = −√3/2. After all, x has to be
negative, since the point (x, y) is to the left of the y-axis. This shows that
cos(7π/6) = −√3/2 and we have identified our point (x, y) as (−√3/2, −1/2).
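
If you like to check this sort of thing on a computer, here is a minimal Python sketch of the x, y, r recipe. The function name trig_from_point is made up for this illustration; it simply divides the coordinates of a point on the ray by its distance from the origin, and the second call is there to show that a different point on the same ray gives the same ratios.

```python
import math

def trig_from_point(x, y):
    """Return (sin, cos, tan) for the ray from the origin through (x, y)."""
    r = math.hypot(x, y)              # distance to the origin, never negative
    if r == 0:
        raise ValueError("the origin itself does not determine a ray")
    sin_t = y / r
    cos_t = x / r
    # tan is undefined when the ray lies along the y-axis (x = 0)
    tan_t = y / x if x != 0 else float("nan")
    return sin_t, cos_t, tan_t

# The point at distance 1 along the ray for 7*pi/6, as found in the text.
s, c, t = trig_from_point(-math.sqrt(3) / 2, -1 / 2)

# A point ten times further out on the same ray: the ratios are unchanged.
s2, c2, t2 = trig_from_point(-5 * math.sqrt(3), -5)

print(s, c, t)
print(s2, c2, t2)
```

Both lines of output should agree with each other, and with sin, cos, and tan of 7π/6, up to rounding in the last decimal place or so.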

2.2.1 The ASTC method 

The key in the previous example is that sin(7π/6) is related to sin(π/6), where
π/6 is the reference angle for 7π/6. In fact, it’s not hard to see that the sine
of any angle is plus or minus the sine of the reference angle! This narrows
it down to just two possibilities, and there’s no need to mess around with x,
y, or r. So in our example, we just needed to find that the reference angle
for 7π/6 is π/6; this immediately told us that sin(7π/6) is equal to either
sin(π/6) or −sin(π/6) and we just had to make sure we got the correct one.
We saw that it was the negative one because y was negative.

Actually, the sine of anything in the third or fourth quadrant must be 
negative because y is negative there. Similarly, the cosine of anything in the 
second or third quadrant must be negative, since x is negative there. The 
tangent is the ratio y/x, which is negative in the second and fourth quadrants
(since one, but not both, of x and y is then negative) but positive in the first 
and third quadrants. 

Let’s summarize these findings in words as well as with a picture. First, all 
three functions are positive in the first quadrant (I). In the second quadrant 
(II), only sin is positive; the other two functions are negative. In the third 
quadrant (III), only tan is positive; the other two functions are negative. 
And finally, in the fourth quadrant (IV), only cos is positive; the other two 
functions are negative. Here’s what it all looks like: 


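[Figure: the ASTC diagram. The axes are labeled as before, and each quadrant carries one letter together with the signs of the three functions:
  Quadrant I (A): sin +, cos +, tan +
  Quadrant II (S): sin +, cos −, tan −
  Quadrant III (T): sin −, cos −, tan +
  Quadrant IV (C): sin −, cos +, tan −]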



In fact, the letters ASTC on the diagram are all you need to remember. They
show you which of the functions are positive in that quadrant. “A” stands for
“All,” meaning all the functions are positive in the first quadrant; the other
letters obviously stand for sin, tan, and cos, respectively. In our example,
7π/6 is in the third quadrant, so only tan is positive there. In particular, sin
is negative, so since we had narrowed the value of sin(7π/6) down to 1/2 or
−1/2, it must be the negative possibility: indeed, sin(7π/6) = −1/2.

The only problem with the ASTC diagram is that it doesn’t really tell
you how to handle the angles 0, π/2, π, or 3π/2, since they lie on the axes.
In this case, it’s best to forget all about the ASTC stuff and draw a graph
of y = sin(x) (or cos(x) or tan(x), as appropriate) and read the value off the
graph. We’ll discuss this in Section 2.3 below.

Meanwhile, here’s a summary of the ASTC method for finding trig
functions of angles between 0 and 2π:


1. Draw the quadrant diagram, decide where in the picture the angle you 
care about is, and then mark that angle in the diagram. 

2. If the angle you want is on the x- or y-axis (that is, not within any 
quadrant), draw a graph of the trig function and read the value off the 
graph (there are some examples in Section 2.3 below). 

3. Otherwise, find the smallest angle between the one we want and the 
x-axis; this is called the reference angle. 



4. If you can, use the important table to work out the value of the trig 
function of the reference angle. That’s the answer you need, except that 
you might need a minus sign in front. 

5. Use the ASTC diagram to decide whether or not you need a minus sign. 
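
The five steps above can be sketched in a few lines of Python. This is only an illustration under some assumptions: the names quadrant, reference_angle, and astc are invented here, the angle is assumed to lie strictly inside one of the four quadrants (step 2, angles on the axes, is not handled), and math.sin, math.cos, and math.tan stand in for "the important table."

```python
import math

# Which functions are positive in each quadrant: All, Sin, Tan, Cos.
POSITIVE = {1: {"sin", "cos", "tan"}, 2: {"sin"}, 3: {"tan"}, 4: {"cos"}}
FUNCS = {"sin": math.sin, "cos": math.cos, "tan": math.tan}

def quadrant(theta):
    """Quadrant number (1 to 4) for an angle strictly inside a quadrant."""
    return int(theta // (math.pi / 2)) + 1

def reference_angle(theta):
    """Smallest angle between the ray for theta and the x-axis."""
    q = quadrant(theta)
    if q == 1:
        return theta
    if q == 2:
        return math.pi - theta        # go up to pi
    if q == 3:
        return theta - math.pi        # come back down to pi
    return 2 * math.pi - theta        # go up to 2*pi, not down to 0

def astc(name, theta):
    """Trig function of theta via reference angle plus an ASTC sign."""
    value = FUNCS[name](reference_angle(theta))
    sign = 1 if name in POSITIVE[quadrant(theta)] else -1
    return sign * value

print(astc("sin", 7 * math.pi / 6))
print(astc("cos", 7 * math.pi / 4))
print(astc("tan", 9 * math.pi / 13))
```

Of course a computer can evaluate math.sin(7 * math.pi / 6) directly; the point of the sketch is only that the reference angle and the sign from the diagram are enough to pin down the answer.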

Let’s look at a couple of examples. How would you find cos(7π/4) and
tan(9π/13)? We’ll look at them one at a time. For cos(7π/4), we notice that
7/4 is between 3/2 and 2, so the angle must be in the fourth quadrant: 


[Figure: the ray for 7π/4, drawn in the fourth quadrant.]



To work out the reference angle, notice that we have to go up to 2π (not
down to 0, beware!) so the reference angle is the difference between 2π and
7π/4, which is (2π − 7π/4) or simply π/4. So, cos(7π/4) is plus or minus
cos(π/4), which is 1/√2 according to our table. Is it plus or minus? The
ASTC picture says that cos is positive in the fourth quadrant, so it’s plus:
cos(7π/4) = 1/√2.

Now let’s look at tan(9π/13). We see that 9/13 is between 1/2 and 1, so
the angle 9π/13 is in the second quadrant:


[Figure: the ray for 9π/13, drawn in the second quadrant with its reference angle marked.]

This time we have to go up to π to get to the x-axis, so the reference angle
is the difference between π and 9π/13, which is π − 9π/13 or simply 4π/13.
So, we know that tan(9π/13) is plus or minus tan(4π/13). Alas, the number
4π/13 isn’t in our table, so we can’t simplify tan(4π/13). We also need to
work out whether it’s plus or minus. Well, the ASTC diagram shows that
only sin is positive in the second quadrant, so tan must be negative there and
we see that tan(9π/13) = −tan(4π/13). That’s as simplified as we can get
without approximating. When solving calculus problems, I don’t recommend
approximating the answer unless you are explicitly asked to. A common
misconception is that the number that comes out on the calculator when you
calculate something like −tan(4π/13) is the actual answer. On the contrary,
it’s just an approximation! So you shouldn’t write

$$-\tan(4\pi/13) = -1.448750113,$$

since it’s just not true. Instead, just leave it as −tan(4π/13) unless you are
specifically asked for an approximation. In that case, use the
approximately-equal symbol and fewer decimal places, rounding appropriately
(unless you are asked for more):

$$-\tan(4\pi/13) \approx -1.449.$$
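
If you are ever asked for such an approximation and a computer is handy, a short Python check like the following will do. It is just a sketch: the variable names are made up, and the value printed is a floating-point approximation, which is exactly the distinction being made above.

```python
import math

# The simplified exact form, and the original expression it came from.
simplified = -math.tan(4 * math.pi / 13)
original = math.tan(9 * math.pi / 13)

# Both are approximations of the same exact number, so they should agree
# to within rounding error.
print(math.isclose(simplified, original))

# Rounded to three decimal places, as you would report it with an
# approximately-equal sign.
print(round(simplified, 3))
```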

By the way, you should rarely need to use a calculator — in fact, some colleges 
don’t even allow them in exams! So you should try to avoid the temptation 
ever to use one. 

2.2.2 Trig functions outside [0, 2π]

There’s still the question of how to take trig functions of angles bigger than
2π or less than 0. In fact this isn’t so bad: simply add or subtract multiples
of 2π until you get between 0 and 2π. You see, it doesn’t just stop at 2π. It
just keeps on wrapping around. For example, if I asked you to stand on the
spot facing due east and then turn around counterclockwise an angle of 450
degrees, it would be reasonable to assume that you’d turn a full revolution
and then an extra 90 degrees. You’d be facing due north. Sure, you’d be
a little dizzier than if you just did a 90-degree counterclockwise turn, but
you’d be facing the same way. So 450 degrees is an equivalent angle to 90
degrees, and of course the same sort of thing is true in radians: in this case,
5π/2 radians is an equivalent angle to π/2 radians. But why stop at one
revolution? How about 9π/2 radians? That’s the same as going around 2π
twice (which gets us up to 4π) and then an extra π/2, so we’ve done 2 useless
revolutions before our final π/2 twist. The revolutions don’t matter, so once
again 9π/2 is equivalent to π/2. This procedure can be extended indefinitely
to get a whole family of angles which are equivalent to π/2:

$$\frac{\pi}{2},\ \frac{5\pi}{2},\ \frac{9\pi}{2},\ \frac{13\pi}{2},\ \frac{17\pi}{2},\ \ldots$$

Of course, each angle is a full revolution, or 2π, more than the previous one.
Still, that’s not the full story: if I’m going to insist that you do all these
counterclockwise revolutions and get that dizzy, you might as well ask to be
allowed to do a clockwise revolution or two to recover. This corresponds to a
negative angle. In particular, if you were facing east and I asked you to turn
−270 degrees counterclockwise, the only sane interpretation of my bizarre
request is to turn 270 degrees (or 3π/2) clockwise. Evidently you’ll still end
up facing due north, so −270 degrees must be equivalent to 90 degrees. Indeed,
adding 360 degrees to −270 degrees just gives us 90 degrees. In radians, we
see that −3π/2 is an equivalent angle to π/2. In addition, we could insist on
more negative (clockwise) full revolutions. In the end, here is the complete
set of angles which are equivalent to π/2:


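$$\ldots,\ -\frac{15\pi}{2},\ -\frac{11\pi}{2},\ -\frac{7\pi}{2},\ -\frac{3\pi}{2},\ \frac{\pi}{2},\ \frac{5\pi}{2},\ \frac{9\pi}{2},\ \frac{13\pi}{2},\ \frac{17\pi}{2},\ \ldots$$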


The sequence has no beginning or end; when I say it’s “complete,” I’m glossing
over the fact that there are infinitely many angles included in the dots at the
beginning and the end. We can avoid the dots by writing the collection in set
notation as {π/2 + 2πn}, where n runs over all the integers.
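
The "add or subtract multiples of 2π" procedure is easy to mimic in Python. Here is a minimal sketch, assuming a helper called normalize (an invented name) built on the % operator, which for a positive divisor returns a non-negative remainder even when the angle is negative.

```python
import math

TWO_PI = 2 * math.pi

def normalize(theta):
    """Equivalent angle between 0 and 2*pi, dropping whole revolutions."""
    return theta % TWO_PI

# Part of the family {pi/2 + 2*pi*n}, for n from -4 to 4.
family = [math.pi / 2 + TWO_PI * n for n in range(-4, 5)]

for angle in family:
    print(angle / math.pi, "pi  ->", normalize(angle) / math.pi, "pi")
```

Every member of the family should come out as (very nearly) 0.5 times π. Since the arithmetic is floating-point, expect tiny rounding errors rather than exact equality.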

Let’s see if we can apply this. How would you find sec(15π/4)? The first
thing to note is that if we can find cos(15π/4), all we need to do is take the
reciprocal in order to get sec(15π/4). So let’s find cos(15π/4) first. Since 15/4
is more than 2, let’s try lopping off 2 from it. Hmm, 15/4 − 2 = 7/4, which is
now between 0 and 2, so that looks promising. Restoring the π, we see that
cos(15π/4) is the same as cos(7π/4) which we already saw is equal to 1/√2.
So, cos(15π/4) = 1/√2. Taking reciprocals, we see that sec(15π/4) is just
√2.

Finally, how about sin(−5π/6)? There are several ways of doing this
problem, but the way suggested above is to try to add multiples of 2π to
−5π/6 until we are between 0 and 2π. In fact, adding 2π to −5π/6 gives
7π/6, so sin(−5π/6) = sin(7π/6), which we already saw is equal to −1/2.
Alternatively, we could have drawn a diagram directly:

Now you have to work out the reference angle from the diagram, and it’s not
too hard to see that it is π/6 and continue as before.


2.3 The Graphs of Trig Functions

It’s really useful to remember what the graphs of the sin, cos, and tan
functions look like. These functions are all periodic, meaning that they repeat
themselves over and over again from left to right. For example, consider
y = sin(x). The graph from 0 to 2π looks like this:



You should be able to produce this graph without thinking, including the
positions of 0, π/2, π, 3π/2, and 2π. Since sin(x) repeats every 2π units (we
say that sin(x) is periodic in x with period 2π), we can extend the graph by
repeating the pattern:



Just reading values off the graph, we can see that sin(3π/2) = −1 and
sin(−π) = 0. As noted earlier, this is how you should deal with multiples
of π/2; no need to mess around with reference angles. Another thing to note
is that the graph has 180° point symmetry about the origin, which means
that sin(x) is an odd function of x. (We looked at odd and even functions in
Section 1.4 of the previous chapter.)

There are relations between trig functions which will come in handy. First,
note that tan and cot may be expressed in terms of sin and cos as follows:

$$\tan(x) = \frac{\sin(x)}{\cos(x)}, \qquad \cot(x) = \frac{\cos(x)}{\sin(x)}.$$

(Sometimes it’s helpful to replace every instance of tan or cot by sin and cos 
using these identities, but you shouldn’t really do this unless you’re stuck.) 

The most important of all the trig identities is Pythagoras’ Theorem
(written in trig form),

$$\cos^2(x) + \sin^2(x) = 1.$$

This is true for any x. (Why is this Pythagoras’ Theorem? If the hypotenuse
of a right-angled triangle is 1 and one of the angles is x, convince yourself
that the other two sides of the triangle have lengths cos(x) and sin(x).)

Now divide this equation by cos²(x). I want you to check that you end up
with

$$1 + \tan^2(x) = \sec^2(x).$$

This also comes up a lot in calculus. Alternatively, you could have divided
the Pythagorean equation above by sin²(x) to get

$$\cot^2(x) + 1 = \csc^2(x).$$




This equation seems to come up less frequently than the others. 

There are some more relationships between trig functions. Have you
noticed that some of the names begin with the syllable “co”? This is short for
the word “complementary.” To say that two angles are complementary means
that they add up to π/2 (or 90 degrees). It does not mean that they are nice
to each other. All puns aside, the fact is that we have the following general
relationship:


$$\text{trig function}(x) = \text{co-trig function}\left(\frac{\pi}{2} - x\right).$$

So in particular, we have

$$\sin(x) = \cos\left(\frac{\pi}{2} - x\right), \quad \tan(x) = \cot\left(\frac{\pi}{2} - x\right), \quad \text{and} \quad \sec(x) = \csc\left(\frac{\pi}{2} - x\right).$$

It even works when the trig function is already a “co”; you just have to realize
that the complement of a complement is the original angle! For example,
co-co-sine is really just sine, and co-co-tan is just tan. Basically this means that
we can also say that

$$\cos(x) = \sin\left(\frac{\pi}{2} - x\right), \quad \cot(x) = \tan\left(\frac{\pi}{2} - x\right), \quad \text{and} \quad \csc(x) = \sec\left(\frac{\pi}{2} - x\right).$$
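
A numerical spot-check is no substitute for convincing yourself why these identities hold, but it is a quick way to catch a misremembered sign. The following Python sketch (the function name check_identities and the tolerance are choices made for this illustration) tests the three Pythagorean-type identities and the first three complementary-angle identities at a given x, assuming x is not a multiple of π/2 so that nothing divides by zero.

```python
import math

def check_identities(x, tol=1e-9):
    """Report whether each identity holds (up to rounding) at the angle x."""
    s, c = math.sin(x), math.cos(x)
    comp = math.pi / 2 - x                      # the complementary angle
    tan, cot = s / c, c / s
    sec, csc = 1 / c, 1 / s

    def close(a, b):
        return math.isclose(a, b, rel_tol=tol, abs_tol=tol)

    return {
        "cos^2 + sin^2 = 1": close(c * c + s * s, 1.0),
        "1 + tan^2 = sec^2": close(1 + tan * tan, sec * sec),
        "cot^2 + 1 = csc^2": close(cot * cot + 1, csc * csc),
        "sin(x) = cos(pi/2 - x)": close(s, math.cos(comp)),
        "tan(x) = cot(pi/2 - x)": close(tan, math.cos(comp) / math.sin(comp)),
        "sec(x) = csc(pi/2 - x)": close(sec, 1 / math.sin(comp)),
    }

for name, ok in check_identities(0.3).items():
    print(name, ok)
```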

Finally, there’s another group of identities which are worth learning. These
are the identities involving sums of angles and the double-angle formulas.
Specifically, you should remember that

$$\sin(A + B) = \sin(A)\cos(B) + \cos(A)\sin(B)$$
$$\cos(A + B) = \cos(A)\cos(B) - \sin(A)\sin(B).$$


It’s also useful to remember that you can switch all the pluses and minuses 
to get some related formulas: 

$$\sin(A - B) = \sin(A)\cos(B) - \cos(A)\sin(B)$$
$$\cos(A - B) = \cos(A)\cos(B) + \sin(A)\sin(B).$$

A nice consequence of the sin(A + B) and cos(A + B) formulas in the box
above is obtained by letting A = B = x. It’s clear that the sine formula is
sin(2x) = 2 sin(x) cos(x), but let’s take a closer look at the cosine formula.
This becomes cos(2x) = cos²(x) − sin²(x); true as this is, it is more useful to
use the Pythagorean identity cos²(x) + sin²(x) = 1 to express cos(2x) as either
2 cos²(x) − 1 or 1 − 2 sin²(x) (convince yourself that these are both valid!). In
summary, the double-angle formulas are

$$\sin(2x) = 2\sin(x)\cos(x)$$
$$\cos(2x) = 2\cos^2(x) - 1 = 1 - 2\sin^2(x).$$

So, how would you write sin(4x) in terms of sin(x) and cos(x)? Well, think of
4x as double 2x and use the sine identity to write sin(4x) = 2 sin(2x) cos(2x).
Then use both identities to get

$$\sin(4x) = 2\bigl(2\sin(x)\cos(x)\bigr)\bigl(2\cos^2(x) - 1\bigr) = 8\sin(x)\cos^3(x) - 4\sin(x)\cos(x).$$

Similarly,

$$\cos(4x) = 2\cos^2(2x) - 1 = 2\bigl(2\cos^2(x) - 1\bigr)^2 - 1 = 8\cos^4(x) - 8\cos^2(x) + 1.$$

You shouldn’t memorize these last two formulas; instead, make sure you
understand how to derive them using the double-angle formulas.
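
If you derive formulas like these yourself, a quick numerical test is a cheap way to catch algebra slips. Here is a small Python sketch along those lines; the function names sin4 and cos4 are invented for this example, and they evaluate the right-hand sides of the two formulas just derived so that they can be compared with math.sin(4 * x) and math.cos(4 * x).

```python
import math

def sin4(x):
    """Right-hand side of the derived formula for sin(4x)."""
    s, c = math.sin(x), math.cos(x)
    return 8 * s * c**3 - 4 * s * c

def cos4(x):
    """Right-hand side of the derived formula for cos(4x)."""
    c = math.cos(x)
    return 8 * c**4 - 8 * c**2 + 1

for x in (0.0, 0.4, 1.3, -2.1, 5.0):
    print(x,
          math.isclose(sin4(x), math.sin(4 * x), abs_tol=1e-12),
          math.isclose(cos4(x), math.cos(4 * x), abs_tol=1e-12))
```

Agreement at a handful of points doesn’t prove an identity, but a single disagreement would certainly disprove one.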

Now, if you can master all the trig in this chapter, you will be in very 
good shape indeed for the rest of the book. So don’t leave it till too late — get 
cracking on a bunch of examples and make sure you learn the table and all 
the boxed formulas! 








CHAPTER 3 


Introduction to Limits


Calculus wouldn’t exist without the concept of limits. This means that we 
are going to spend a lot of time looking at them. It turns out that it’s pretty 
tricky to define a limit properly, but you can get an intuitive understanding of 
limits even without going into the gory details. This will be enough to tackle 
differentiation and integration. So, this chapter contains only the intuitive 
version; check out Appendix A for the formal version. All in all, here’s what 
we’ll look at in this chapter: 

• an intuitive idea of what a limit is; 

• left-hand, right-hand, and two-sided limits, and limits at ∞ and −∞;

• when limits fail to exist; and

• the sandwich principle (also known as the “squeeze principle”).

3.1 Limits: The Basic Idea

Let’s dive in. We start with some function f and a point on the x-axis,
which we’ll call a. Here is what we’d like to understand: what does f(x) look
like when x is really really close to a, but not equal to a? This is a pretty
strange question to ask, which is probably why it took until relatively recently
for humankind to develop calculus.

Here’s an example showing why we might want to ask this question. Let
f have domain ℝ\{2} (all real numbers except for 2), and set f(x) = x − 1
on this domain. Formally, you might write:

$$f(x) = x - 1 \quad \text{when } x \neq 2.$$

This seems like a weird sort of function: after all, why on earth would we
want to exclude 2 from the domain? Actually, in the next chapter, we’ll see
that f arises quite naturally as a rational function (see the second example
in Section 4.1). In the meantime, let’s just take f for what it is and sketch a
graph of it:

[Figure: graph of y = f(x).]



What is f(2)? Perhaps you’d like to say that f(2) = 1, but that would be
a load of bull since 2 isn’t even in the domain of f. The best you can do is
to say that f(2) is undefined. On the other hand, we can find the value of
f(x) when x is really really close to 2 and see what happens. For example,
f(2.01) = 1.01, and f(1.999) = 0.999. If you think about it, you can see that
when x is really really close to 2, the value of f(x) is really really close to 1.

What’s more, you can get as close as you want to 1, without actually
getting to 1, by letting x be close enough to 2. For example, if you want f(x)
to be within 0.0001 of 1, you could take any x between 1.9999 and 2.0001
(except of course for x = 2, which is forbidden). If you instead wanted f(x)
to be within 0.000007 of 1, then you’d have to be a little more picky about your
choice of x: this time you’d need to take x between 1.999993 and 2.000007
(except for x = 2, once again).
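
To see this "as close as you want" game played out, here is a small Python sketch. It is only an illustration: the function f below copies the example above (including the forbidden point x = 2, which raises an error), while the helper within and the sample points are made up for the demonstration.

```python
def f(x):
    """The example function: x - 1 for every x except 2."""
    if x == 2:
        raise ValueError("2 is not in the domain of f")
    return x - 1

def within(x, tol):
    """Is f(x) within tol of the proposed limit 1?"""
    return abs(f(x) - 1) < tol

# Values of f at points creeping up on 2 from both sides.
for x in (1.9, 1.99, 1.999, 2.001, 2.01, 2.1):
    print(x, f(x))

# The two tolerances discussed in the text.
print(within(1.99995, 0.0001))       # x is between 1.9999 and 2.0001
print(within(2.00002, 0.000007))     # close enough before, but not now
print(within(2.0000069, 0.000007))   # x is between 1.999993 and 2.000007
```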

Anyway, these ideas are described in much greater detail in Section A.1 of
Appendix A. Without getting bogged down, let’s cut to the chase and just
write

$$\lim_{x \to 2} f(x) = 1.$$

If you read this out loud, it should sound like “the limit, as x goes to 2, of 
f(x) is equal to 1.” Again, this means that when x is near 2 (but not equal 
to it), the value of f(x) is near 1. How near? As near as your heart desires. 
Another way of writing the above statement is 

$$f(x) \to 1 \quad \text{as } x \to 2.$$



This is harder to do computations with, but its meaning is quite clear: as x 
journeys along the number line from the left or the right toward the number 
2, the value of f(x) gets very very close to 1 (and stays close!). 

Now, let’s take the above function f and modify it slightly. Indeed, suppose
that a new function g has the following graph:

[Figure: graph of y = g(x).]

The domain of g is all real numbers, and g(x) can be defined in piecewise
fashion as

$$g(x) = \begin{cases} x - 1 & \text{if } x \neq 2, \\ 3 & \text{if } x = 2. \end{cases}$$

What is $\lim_{x \to 2} g(x)$? The trick here is that the value of g(2) is irrelevant! It’s
only the values of g(x) where x is close to 2, not actually at 2, which matter.
Ignoring x = 2, the function g is identical to the function f we looked at
earlier. So, $\lim_{x \to 2} g(x) = 1$ as before, even though g(2) = 3.

Here’s an important point: when you write something like

$$\lim_{x \to 2} f(x) = 1,$$

the left-hand side isn’t actually a function of x! Remember, the equation
means that f(x) is close to 1 when x is close to 2. We could actually replace
x by any other letter and this would still be true. For example, f(q) is close
to 1 when q is close to 2, so we have

$$\lim_{q \to 2} f(q) = 1.$$

We can go nuts with this and also write

$$\lim_{b \to 2} f(b) = 1, \quad \lim_{z \to 2} f(z) = 1, \quad \lim_{\alpha \to 2} f(\alpha) = 1,$$

and so on until we run out of letters and symbols! The point is that in the
limit

$$\lim_{x \to 2} f(x) = 1,$$

the variable x is just a dummy variable. It is a temporary label for some
quantity that is (in this case) getting very close to 2. It can be replaced by
any other letter, as long as you swap it out wherever else it appears; also, when
you work out the value of the limit, the answer cannot include the dummy
variable. So be smart about your dummy variables.

3.2 Left-Hand and Right-Hand Limits

We’ve seen that limits describe the behavior of a function near a certain point. 
Think about how you would describe the behavior of h(x) near x = 3: 









[Figure: graph of y = h(x), with a hiker walking along the curve.]

Of course, the fact that h(3) = 2 is irrelevant as far as the limiting behavior
is concerned. Now, what happens when you approach x = 3 from the left?
Imagine that you’re the hiker in the picture, climbing up and down the hill.
The value of h(x) tells you how high up you are when your horizontal position
is at x. So, if you walk rightward from the left of the picture, then when your
horizontal position is close to 3, your height is close to 1. Sure, there’s a sheer
drop when you get to x = 3 (not to mention a weird little ledge floating in
space above you!), but we don’t care about this for the moment. Everything
to the right of x = 3, including x = 3 itself, is irrelevant. So we’ve just seen
that the left-hand limit of h(x) at x = 3 is equal to 1.

On the other hand, if you are walking leftward from the right-hand side
of the picture, your height becomes close to −2 as your horizontal position
gets close to x = 3. This means that the right-hand limit of h(x) at x = 3 is
equal to −2. Now everything to the left of x = 3 (including x = 3 itself) is
irrelevant!

We can summarize our findings from above by writing 

$$\lim_{x \to 3^-} h(x) = 1 \quad \text{and} \quad \lim_{x \to 3^+} h(x) = -2.$$

The little minus sign after the 3 in the first limit above means that the limit 
is a left-hand limit, and the little plus sign in the second limit means that it’s 
a right-hand limit. It’s important to write the minus or plus sign after the 3, 
not before it! For example, if you write 

$$\lim_{x \to -3} h(x),$$

then you are referring to the regular two-sided limit of h(x) at x = −3, not
the left-hand limit at x = 3. These are two very different animals indeed.
By the way, the reason that you write x → 3⁻ under the limit sign for the
left-hand limit is that this limit only involves values of x less than 3. That is,
you need to take a little bit away from 3 to see what’s going on. In a similar
manner, when you write x → 3⁺ for the right-hand limit, this means that you
only need to consider what happens when you add a little bit onto 3.
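
The "take a little bit away" and "add a little bit on" idea translates directly into a numerical experiment. The Python sketch below is hypothetical in one important respect: the text gives h only as a picture, so the formula used here is a stand-in chosen to have the same features near x = 3 (heights near 1 on the left, near −2 on the right, and h(3) = 2). The helper name approach is also invented.

```python
def h(x):
    """A stand-in for the pictured function, matching it only near x = 3."""
    if x < 3:
        return x - 2      # heights approach 1 as x climbs toward 3
    if x == 3:
        return 2          # the little ledge: h(3) = 2
    return x - 5          # heights approach -2 as x drops toward 3

def approach(func, a, side, steps=6):
    """Values of func at a -/+ 0.1, 0.01, ... (never at a itself)."""
    sign = -1 if side == "left" else 1
    return [func(a + sign * 10.0 ** -k) for k in range(1, steps + 1)]

print(approach(h, 3, "left"))     # settles toward the left-hand limit
print(approach(h, 3, "right"))    # settles toward the right-hand limit
print(h(3))                       # plays no role in either limit
```

A table of values like this can only suggest what a limit is; it doesn’t prove anything, since a function could misbehave between the sample points.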

Now, limits don’t always exist, as we’ll see in the next section. But here’s
something important: the regular two-sided limit at x = a exists exactly
when both left-hand and right-hand limits at x = a exist and are equal to
each other! In that case, all three limits (two-sided, left-hand, and
right-hand) are the same. In math-speak, I’m saying that

$$\lim_{x \to a^-} f(x) = L \quad \text{and} \quad \lim_{x \to a^+} f(x) = L$$

is the same thing as

$$\lim_{x \to a} f(x) = L.$$

If the left-hand and right-hand limits are not equal, as in the case of our
function h from above, then the two-sided limit does not exist. We’d just
write

$$\lim_{x \to 3} h(x) \text{ does not exist}$$

or you could even write “DNE” instead of “does not exist.”

3.3 When the Limit Does Not Exist



We just saw that a two-sided limit doesn’t exist when the corresponding
left-hand and right-hand limits are different. Here’s an even more dramatic
example of this. Consider the graph of f(x) = 1/x:



What is $\lim_{x \to 0} f(x)$? It may be a bit much to expect the two-sided limit to
exist here, so let’s first try to find the right-hand limit, $\lim_{x \to 0^+} f(x)$. Looking at
the graph, it seems as though f(x) is very large when x is positive and close
to 0. It doesn’t really get close to any number in particular as x slides down
to 0 from the right; it just gets larger and larger. How large? Larger than
anything you can imagine! We say that the limit is infinity, and write

$$\lim_{x \to 0^+} \frac{1}{x} = \infty.$$

Similarly, the left-hand limit here is −∞, since f(x) gets arbitrarily more and
more negative as x slides upward to 0. That is,

$$\lim_{x \to 0^-} \frac{1}{x} = -\infty.$$
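
You can watch this blow-up happen numerically. In the following Python sketch (the list names are made up for the illustration), x takes the values 0.1, 0.01, and so on down to 0.000001 on the right of 0, and the mirror-image values on the left.

```python
def f(x):
    return 1 / x

# x slides down to 0 from the right: 0.1, 0.01, ..., 0.000001
right_values = [f(10.0 ** -k) for k in range(1, 7)]

# x slides up to 0 from the left: -0.1, -0.01, ..., -0.000001
left_values = [f(-(10.0 ** -k)) for k in range(1, 7)]

print(right_values)   # positive and growing without any sign of settling
print(left_values)    # negative and growing in size just as fast
```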



The two-sided limit certainly doesn’t exist, since the left-hand and right-hand 
limits are different. On the other hand, consider the function g defined by 
g(x) = 1/x². The graph looks like this:


















3.4 Limits at ∞ and −∞

There is one more type of limit that we need to investigate. We’ve
concentrated on the behavior of a function near a point x = a. However, sometimes
it is important to understand how a function behaves when x gets really huge.
Another way of saying this is that we are interested in the behavior of a
function as its argument x goes to ∞. We’d like to write something like

$$\lim_{x \to \infty} f(x) = L$$

and mean that f(x) gets really close, and stays close, to the value L when x
is large. (More details can be found in Section A.3.3 of Appendix A.) The
important thing to realize is that writing “$\lim_{x \to \infty} f(x) = L$” indicates that the
graph of f has a right-hand horizontal asymptote at y = L. There is a similar
notion for when x heads toward −∞: we write

$$\lim_{x \to -\infty} f(x) = L,$$

which means that f(x) gets extremely close, and stays close, to L when x gets
more and more negative (or more precisely, −x gets larger and larger). This
of course corresponds to the graph of y = f(x) having a left-hand horizontal
asymptote. You can turn these definitions around if you like and say:

“f has a right-hand horizontal asymptote at y = L”
means that $\lim_{x \to \infty} f(x) = L$.

“f has a left-hand horizontal asymptote at y = M”
means that $\lim_{x \to -\infty} f(x) = M$.

Of course, something like y = x² doesn’t have any horizontal asymptotes: the
values of y just go up and up as x gets larger. In symbols, we can write this
as $\lim_{x \to \infty} x^2 = \infty$. Alternatively, the limit may not even exist. For example,
what is $\lim_{x \to \infty} \sin(x)$? Well, what value is sin(x) getting closer and closer to
(and staying close)? It’s just oscillating back and forth between −1 and 1, so
it never really gets anywhere. There’s no horizontal asymptote, nor does the
function wander off to ∞ or −∞; the best you can do is to say that $\lim_{x \to \infty} \sin(x)$
does not exist (DNE). Again, see Section A.3.4 of Appendix A for a proof of
this.

Let’s return to the function f given by f(x) = sin(1/x) that we looked at
in the previous section. What happens when x gets very large? Well, when
x is large, 1/x is very close to 0. Since sin(0) = 0, it should be true that
sin(1/x) is also very close to 0. The larger x gets, the closer sin(1/x) is to 0.
My argument has been a little sketchy but hopefully you’re convinced that*

$$\lim_{x \to \infty} \sin(1/x) = 0.$$

*If not, see Section A.4.1 of Appendix A!

So sin(1/x) has a horizontal asymptote at y = 0. This allows us to extend the
graph of y = sin(1/x) that we drew above, at least to the right. We should
still worry about what happens when x < 0. This isn’t too bad, since f is an
odd function. Here’s why:


$$f(-x) = \sin\left(\frac{1}{-x}\right) = \sin\left(-\frac{1}{x}\right) = -\sin\left(\frac{1}{x}\right) = -f(x).$$


Note that we used the fact that sin(x) is an odd function of x to get from
sin(−1/x) to −sin(1/x). So, since odd functions have that nice symmetry
about the origin (see Section 1.4 in Chapter 1), we can complete the graph of
y = sin(1/x) as follows:



Again, it’s hard to draw what happens for x near 0. The closer x is to 0, the 
more wildly the function oscillates, and of course the function is undefined at 
x = 0. In the above picture, I chose to avoid the black smudge in the middle
and just leave the oscillations up to your imagination. 
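
The two facts used to sketch y = sin(1/x), namely that it heads to 0 for large x and that it is odd, can both be spot-checked numerically. The Python sketch below does this at a few sample points chosen arbitrarily for the illustration; it is a sanity check, not a proof.

```python
import math

def f(x):
    return math.sin(1 / x)

# As x gets large, 1/x is close to 0, and so is sin(1/x).
large_x_values = [f(x) for x in (10.0, 1e3, 1e6)]
print(large_x_values)

# Oddness: f(-x) should equal -f(x) wherever f is defined.
odd_checks = [math.isclose(f(-x), -f(x)) for x in (0.3, 2.0, 50.0)]
print(odd_checks)
```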


3.4.1 Large numbers and small numbers

I hope we can all agree that 1,000,000,000,000 is a large number. So how about 
−1,000,000,000,000? Perhaps controversially, I want you to think of this as
a large negative number rather than a small number. An example of a small 
number would be 0.000000001, while —0.000000001 is a small number too — 
more precisely, a small negative number. Funnily enough, we’re not going to 
think of 0 itself as being small: it’s just zero. So our informal definition of 
large numbers and small numbers looks like this: 



• A number is large if its absolute value is a really big number. 

• A number is small if it is really close to 0 (but not actually equal to 0). 

Although the above definition will serve us well in practice, it’s a really
lame definition. What do I mean by “really big” and “really close to 0”? Well,
consider the limit equation

$$\lim_{x \to \infty} f(x) = L.$$

As we saw above, this means that when x is a large enough number, the value
of f(x) is almost L. The question is, how large is “large enough”? It depends
on how close to L you want f(x) to be! Still, from a practical point of view,
a number x is large enough if the graph of y = f(x) starts looking like it’s
getting serious about snuggling up to the horizontal asymptote at y = L. Of
course, everything depends on what the function f is, as you can see from the
following picture:

[Figure: two graphs of y = f(x), each approaching a horizontal asymptote at y = L, with 10, 100, and 200 marked on the x-axis.]


In both cases, f(10) is nowhere near L. In the left-hand picture, it looks
like f(x) is pretty close to L when x is at least 100, so any number above 100
would be large. In the right-hand picture, f(100) is far away from L, so now
100 isn’t large enough. You probably need to go up to about 200 in this case.
So can’t you just pick a number like 1,000,000,000,000 and say that it’s always
large? Nope: a function might wander around until 5,000,000,000,000 before
it starts getting close to its horizontal asymptote. The point is that the term
“large” has to be taken in context, relative to some function or limit. Luckily,
there’s plenty of room up above: even a number like 1,000,000,000,000 is
pretty puny compared to 10^100 (a googol), which itself is chicken feed in
comparison with 10^1000000, and so on. By the way, we’ll often use the term
“near ∞” in place of “large and positive.” (A number can’t really be near ∞
in the literal sense, since ∞ is so far away from everything. The term “near
∞” makes sense, though, in the context of limits as x → ∞.)
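
Here is a small Python experiment that makes the same point with numbers. Everything in it is hypothetical: the two functions fast and slow are invented for the illustration (they are not the ones in the picture), both have a horizontal asymptote at y = 1, and the helper large_enough simply scans a grid of whole numbers to find where each function starts staying within a chosen tolerance of L.

```python
def fast(x):
    return 1 + 1 / x            # settles down toward 1 fairly quickly

def slow(x):
    return 1 + 1000 / x**2      # also heads to 1, but takes longer

def large_enough(func, L, tol, xs):
    """Smallest x in the increasing grid xs from which func stays within
    tol of L at every later grid point (None if even the last one fails)."""
    answer = None
    for x in reversed(xs):
        if abs(func(x) - L) < tol:
            answer = x
        else:
            break
    return answer

grid = list(range(1, 1001))
print(large_enough(fast, 1, 0.03, grid))
print(large_enough(slow, 1, 0.03, grid))
print(large_enough(slow, 1, 0.003, grid))   # a pickier tolerance
```

The same tolerance gives different thresholds for the two functions, and tightening the tolerance pushes the threshold further out. That is the sense in which "large" only means something relative to a particular function and a particular notion of "close."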

Of course, all this also applies to limits as x → −∞, except that you just 
stick a minus sign in front of all the large positive numbers above. In this case 
we’ll sometimes say “near −∞” to emphasize that we are referring to large 
negative numbers. 

On the other hand, we’ll often be looking at limit equations of the form 

$$\lim_{x \to 0} f(x) = L, \qquad \lim_{x \to 0^+} f(x) = L \qquad \text{or} \qquad \lim_{x \to 0^-} f(x) = L.$$

In all three of these cases, we know that when x is close enough to 0, the 
value of f(x) is almost L. (For the right-hand limit, x also has to be positive, 
while for the left-hand limit, x has to be negative.) Again, how close does x 
have to be to 0? It depends on the function f. So, when we say a number is 
“small” (or “near 0”), we’ll have to take this in the context of some function 
or limit, just as in the case of “large.” 

Although this discussion really tightens up the above lame definition, it’s 
still not perfect. If you want to learn more, you should really check out 
Sections A.1 and A.3.3 in Appendix A. 

3.5 Two Common Misconceptions about Asymptotes 

Now seems like a good time to correct a couple of common misconceptions 
about horizontal asymptotes. First, a function doesn’t have to have the same 
horizontal asymptote on the left as on the right. In the graph of f(x) = 1/x 
on page 45 above, there is a horizontal asymptote at y = 0 on both the 
right-hand side and the left-hand side, which means that 

$$\lim_{x \to \infty} \frac{1}{x} = 0 \qquad \text{and} \qquad \lim_{x \to -\infty} \frac{1}{x} = 0.$$

However, consider the graph of y = tan⁻¹(x) (or if you prefer, y = arctan(x); 
this is the inverse tangent function and you can write it either way): 



This function has a right-hand horizontal asymptote at y = π/2 and a 
left-hand horizontal asymptote at y = −π/2; these are not the same. We can also 
express this in terms of limits: 

$$\lim_{x \to \infty} \tan^{-1}(x) = \frac{\pi}{2} \qquad \text{and} \qquad \lim_{x \to -\infty} \tan^{-1}(x) = -\frac{\pi}{2}.$$

So a function can indeed have different right- and left-hand horizontal 
asymptotes, but there can be at most two horizontal asymptotes: one on the right 
and one on the left. It might have none or one: for example, y = 2^x has a 
left-hand horizontal asymptote but not a right-hand one (see the graph on 
page 22). This is in contrast to vertical asymptotes: a function can have as 
many of those as it feels like (for example, y = tan(x) has infinitely many). 
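If you’d like a numerical sanity check of the two different horizontal asymptotes of the inverse tangent, here is a small sketch using Python’s standard math.atan. The sample values of x are arbitrary choices, and the names gap_right and gap_left are just illustrative.

```python
import math

for x in (10.0, 1e3, 1e6):
    right = math.atan(x)    # creeps up toward  pi/2 as x grows
    left = math.atan(-x)    # creeps down toward -pi/2 as x becomes very negative
    print(x, right, left)

# How far the function still is from each asymptote at x = +/- 1,000,000.
gap_right = math.pi / 2 - math.atan(1e6)
gap_left = math.atan(-1e6) + math.pi / 2
print(gap_right, gap_left)
```

Both gaps are tiny and positive: the graph gets very close to y = π/2 on the right and y = −π/2 on the left without reaching either line.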

Another common misconception is that a function can’t cross its asymptote. 
Perhaps you have learned that an asymptote is a line that a function 
gets closer and closer to without ever crossing. This just isn’t true, at least 
when you’re talking about horizontal asymptotes. For example, consider the 
function f given by f(x) = sin(x)/x, where for the moment we only care 
about what happens when x is positive and large. The value of sin(x) 
oscillates between −1 and 1, so the value of sin(x)/x oscillates between the curves 
y = −1/x and y = 1/x. Also, sin(x)/x has the same zeroes as sin(x) does, 
namely π, 2π, 3π, .... Putting it all together, the graph looks like this: 

The curves y = 1/x and y = −1/x, which are drawn as dotted curves in the 
graph, form what’s called the envelope of the sine wave. In any event, as you 
can see from the graph, if there’s any justice in the world, then it should be 
true that 

$$\lim_{x \to \infty} \frac{\sin(x)}{x} = 0.$$

This means that the x-axis is a horizontal asymptote for f, even though the 
graph of y = f(x) crosses the axis over and over again. Now, to justify the 
above limit, we’ll need to apply something called the sandwich principle. The 
justification is at the end of the next section. 
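Here is a short numerical sketch of both claims: the graph of y = sin(x)/x keeps crossing the x-axis, and yet its values are trapped inside the envelope. The sample points are my own choice (halfway between consecutive zeroes), not something taken from the graph.

```python
import math


def f(x):
    return math.sin(x) / x


# Sample halfway between consecutive zeroes pi, 2*pi, 3*pi, ...
samples = [(k + 0.5) * math.pi for k in range(1, 8)]
signs = [f(x) > 0 for x in samples]
print(signs)  # the sign flips from one sample to the next

# The values are trapped between -1/x and 1/x, so they shrink as x grows.
for x in (10.0, 100.0, 1e4, 1e6):
    print(x, f(x), 1 / x)
```

The alternating signs mean the graph must cross the axis between samples, while the envelope bound forces the values toward 0.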


3.6 The Sandwich Principle 

The sandwich principle, also known as the squeeze principle, says that if a 
function f is sandwiched between two functions g and h that converge to the 
same limit L as x → a, then f also converges to L as x → a. 

Here’s a more precise statement of the principle. Suppose that for all 
x near a, we have g(x) ≤ f(x) ≤ h(x). That is, f(x) is sandwiched (or 
squeezed) between g(x) and h(x). Also, let’s suppose that lim_{x→a} g(x) = L and 
lim_{x→a} h(x) = L. Then we can conclude that lim_{x→a} f(x) = L; that is, all three 
functions have the same limit as x → a. As usual, the picture tells the story: 









[…] except this time the inequality g(x) ≤ f(x) ≤ h(x) only has to be true on 
the side of a that you care about. For example, what is 

$$\lim_{x \to 0^+} x \sin\left(\frac{1}{x}\right)?$$

The graph of y = x sin(1/x) is similar to that of y = sin(1/x), […] 
the factor of x which causes the function to be trapped between […] 
of y = x and y = −x. Here’s what the graph looks like […] 
0.3: 




We still have the wild oscillations as x goes to 0 from […] 
now damped by the envelope lines. In particular, finding […] 
envelope line y = −x, and the function h is the upper envelope line y = x. 
We need to show that g(x) ≤ f(x) ≤ h(x) for x > 0. We don’t care about 
x < 0 since we only need the right-hand limit of f(x) at x = 0. (Indeed, if 
you extend the lines to negative x, you can see that g(x) is actually greater 
than h(x) for x < 0, so the sandwich is the wrong way around!) So, how do 
we show that g(x) ≤ f(x) ≤ h(x) when x > 0? We’ll use the fact that the 
sine of any number (in our case, 1/x) is between −1 and 1 inclusive: 

$$-1 \le \sin\left(\frac{1}{x}\right) \le 1.$$

Now multiply this inequality through by x, which is cool because x > 0; we 
get 

$$-x \le x \sin\left(\frac{1}{x}\right) \le x.$$


But this is precisely g(x) ≤ f(x) ≤ h(x), which is what we need. Finally, note 
that 

$$\lim_{x \to 0^+} g(x) = \lim_{x \to 0^+} (-x) = 0 \qquad \text{and} \qquad \lim_{x \to 0^+} h(x) = \lim_{x \to 0^+} x = 0.$$

So, since the values g(x) and h(x) of the sandwiching functions converge to 
the same number, 0, as x → 0⁺, so does f(x). That is, we’ve shown that 

$$\lim_{x \to 0^+} x \sin\left(\frac{1}{x}\right) = 0.$$

Remember, this certainly isn’t true without the factor x out front; the limit 
of sin(1/x) as x → 0⁺ does not exist, as we saw in Section 3.3 above. 
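The sandwich for x sin(1/x) can also be checked numerically. This is only a sketch at a handful of sample points of my own choosing; it illustrates the inequality, it doesn’t prove the limit.

```python
import math


def f(x):
    return x * math.sin(1 / x)


def g(x):
    return -x   # lower envelope line


def h(x):
    return x    # upper envelope line


for x in (0.3, 0.01, 1e-4, 1e-8):
    # f(x) is squeezed between g(x) and h(x) for every positive x.
    print(x, g(x), f(x), h(x))

squeezed = all(g(x) <= f(x) <= h(x) for x in (0.3, 0.01, 1e-4, 1e-8))
print(squeezed)
```

As x shrinks toward 0 from the right, g(x) and h(x) both shrink to 0, and f(x) has nowhere else to go.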

We still haven’t resolved the issue of justifying the limit from the end of 
the previous section! Remember, we wanted to show that 

$$\lim_{x \to \infty} \frac{\sin(x)}{x} = 0.$$

To do this, we have to invoke a slightly different form of the sandwich principle, 
involving limits at ∞. In this case we need g(x) ≤ f(x) ≤ h(x) to be true 
for all large x; then if we know that lim_{x→∞} g(x) = L and lim_{x→∞} h(x) = L, we 
can also say that lim_{x→∞} f(x) = L. This is almost the same as the sandwich 
principle for finite limits. To establish the above limit, we again use the fact 
that −1 ≤ sin(x) ≤ 1 for all x, but this time we divide by x to get 

$$-\frac{1}{x} \le \frac{\sin(x)}{x} \le \frac{1}{x}$$

for all x > 0. Now let x → ∞; since both −1/x and 1/x have 0 as their limit, 
the same must be true for sin(x)/x. That is, since 

$$\lim_{x \to \infty} -\frac{1}{x} = 0 \qquad \text{and} \qquad \lim_{x \to \infty} \frac{1}{x} = 0,$$

we must also have 

$$\lim_{x \to \infty} \frac{\sin(x)}{x} = 0.$$

In summary, here’s what the sandwich principle says: 


    If g(x) ≤ f(x) ≤ h(x) for all x near a, 
    and lim_{x→a} g(x) = lim_{x→a} h(x) = L, then 
    lim_{x→a} f(x) = L. 


This also works for left-hand or right-hand limits; in that case, the inequality 
only has to be true for x on the appropriate side of a. It also works when 
a is ∞ or −∞; in that case, the inequality has to be true for x really large 
(positively or negatively, respectively). 

3.7 Summary of Basic Types of Limits 

We have looked at a whole bunch of different basic types of limits. Let’s 
finish this chapter with some representative diagrams showing the most common 
possibilities: 

1. The right-hand limit at x = a. Behavior of f(x) to the left of x = a, and 
at x = a itself, is irrelevant. (This means that it doesn’t matter what values 
f(x) takes for x < a, as far as the right-hand limit is concerned. In fact, f(x) 
need not even be defined for x < a.) 



2. The left-hand limit at x = a. Behavior of f(x) to the right of x = a, and 
at x = a itself, is irrelevant. 









right-hand limits e: 
In the second picti 
limit exists and is f 





CHAPTER 4 


How to Solve Limit Problems Involving Polynomials 


In the previous chapter, we looked at limits from a mostly conceptual 
viewpoint. Now it’s time to see some of the techniques used to evaluate limits. 
For the moment, we’ll concentrate on limits involving polynomials; later on 
we’ll see how to deal with trig functions, exponentials, and logarithms. As 
we’ll see in the next chapter, differentiation involves taking limits of ratios, so 
most of our focus will be on this type of limit. 

When you’re taking the limit of a ratio of two polynomials, it’s really 
important to notice where the limit is being taken. In particular, the techniques 
for dealing with x → ∞ and x → a (for some finite a) are completely different. 
So, we’ll split up our plan of attack into limits involving the following types 
of functions: 

• rational functions as x → a; 

• functions involving square roots as x → a; 

• rational functions as x → ∞; 

• ratios of polynomial-like (or “poly-type”) functions as x → ∞; 

• rational functions/poly-type functions as x → −∞; and 

• functions involving absolute values. 


4.1 Limits Involving Rational Functions as x → a 


Let’s start off with limits that look like this: 

$$\lim_{x \to a} \frac{p(x)}{q(x)},$$

where p and q are polynomials and a is a finite number. (Remember that 
the quotient p(x)/q(x) of two polynomials is called a rational function.) The 
first thing you should always try is to substitute the value of a for x. If the 
denominator isn’t 0, then you’re in good shape: the value of the limit is just 
what you get when you substitute. For example, what is 

$$\lim_{x \to -1} \frac{x^2 - 3x + 2}{x - 2}?$$

Simply plug x = −1 into the expression (x² − 3x + 2)/(x − 2), and you get 

$$\frac{(-1)^2 - 3(-1) + 2}{-1 - 2} = \frac{6}{-3} = -2.$$

The denominator isn’t 0, so −2 is the value of the limit. (I know that I said in 
the previous chapter that the value of the function at the limit point, which 
is x = −1 in this case, is irrelevant; but in the next chapter we’ll look at the 
concept of continuity, which will justify this “plugging-in” method.) 

On the other hand, if you want to find 

$$\lim_{x \to 2} \frac{x^2 - 3x + 2}{x - 2},$$

then plugging in x = 2 won’t work so well: you get (4 − 6 + 2)/(2 − 2), which 
simplifies down to 0/0. This is called an indeterminate form. If you use the 
plugging-in method and get zero divided by zero, then anything could happen: 
the limit might be finite, the limit might be ∞ or −∞, or the limit might 
not exist. The above example can be solved by the important technique of 
factoring everything in sight. In particular, x² − 3x + 2 can be factored as 
(x − 2)(x − 1), so we can write 

$$\lim_{x \to 2} \frac{x^2 - 3x + 2}{x - 2} = \lim_{x \to 2} \frac{(x - 2)(x - 1)}{x - 2} = \lim_{x \to 2} (x - 1)$$

by canceling. Now there’s no impediment to plugging x = 2 into the 
expression (x − 1); you just get 2 − 1, which equals 1. That’s the value of our 
limit. 

This brings us to a point which is often misunderstood: are the two 
functions f and g defined by 

$$f(x) = \frac{x^2 - 3x + 2}{x - 2} \qquad \text{and} \qquad g(x) = x - 1$$

the same function? Why can’t you say that 

$$\frac{x^2 - 3x + 2}{x - 2} = \frac{(x - 2)(x - 1)}{x - 2} = x - 1?$$

Well, you almost can! The only problem is when x = 2, because then the 
denominator (x − 2) is equal to 0 and that doesn’t make sense. So f and g 
are not the same function: the number 2 is not in the domain of f but it is in 
the domain of g. (We’ve actually encountered this function f before; check 
out the discussion and graph at the beginning of Chapter 3.) On the other 
hand, if you put limits in front of everything in the above chain of equations, 
it all becomes correct because the values of f(x) and g(x) at x = 2 don’t 
matter; it’s only the values of f(x) and g(x) near x = 2 that count. So the 
solution of the previous limit problem is indeed valid. 
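The difference between f and g is easy to see with exact arithmetic. The sketch below uses Python’s standard fractions.Fraction so that no rounding is involved; the step sizes 1/10, 1/1000 and 1/1000000 are arbitrary choices.

```python
from fractions import Fraction


def f(x):
    x = Fraction(x)
    return (x**2 - 3*x + 2) / (x - 2)   # not defined at x = 2


def g(x):
    return Fraction(x) - 1              # defined everywhere


# Near x = 2 (but not at 2) the two functions agree exactly.
for k in (1, 3, 6):
    step = Fraction(1, 10**k)
    print(f(2 + step) == g(2 + step), f(2 - step) == g(2 - step))

# At x = 2 itself, only g makes sense.
try:
    f(2)
except ZeroDivisionError:
    print("f(2) is undefined, but g(2) =", g(2))
```

Since the limit only looks at values near x = 2, the one point where f and g differ never gets consulted.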

Let’s look at another example of an indeterminate form. Again, the 
technique is to try to factor everything in sight. In addition to knowing how to 
factor quadratics, it’s really useful to know the formula for the difference of 
two cubes: 

$$\boxed{a^3 - b^3 = (a - b)(a^2 + ab + b^2).}$$

Here’s a harder example where you need to use this formula: find 

$$\lim_{x \to 3} \frac{x^3 - 27}{x^4 - 5x^3 + 6x^2}.$$

If you plug in x = 3, you indeed get 0/0 (try it and see). So let’s try to factor 
both the numerator and the denominator. The numerator is the difference 
between x³ and 3³, so we can use the boxed formula above. The 
denominator has an obvious factor of x², so it can be written as x²(x² − 5x + 6). 
The quadratic x² − 5x + 6 can also be factored; altogether, then, you should 
convince yourself that we have 

$$\lim_{x \to 3} \frac{x^3 - 27}{x^4 - 5x^3 + 6x^2} = \lim_{x \to 3} \frac{(x - 3)(x^2 + 3x + 9)}{x^2 (x - 3)(x - 2)}.$$

Substituting x = 3 doesn’t work because of the factor of (x − 3) in the 
denominator. On the other hand, since we are taking limits, we only need to see 
what happens when x is near 3; so we are perfectly justified in canceling out 
the factors of (x − 3) from the numerator and denominator, since they are never 
equal to 0. So, using the plugging-in technique after factoring and canceling, 
the whole solution looks like this: 

$$\lim_{x \to 3} \frac{x^3 - 27}{x^4 - 5x^3 + 6x^2} = \lim_{x \to 3} \frac{(x - 3)(x^2 + 3x + 9)}{x^2 (x - 3)(x - 2)} = \lim_{x \to 3} \frac{x^2 + 3x + 9}{x^2 (x - 2)} = \frac{3^2 + 3 \cdot 3 + 9}{3^2 (3 - 2)} = 3.$$
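If you want to double-check the factoring and the final answer, the following sketch compares the original expression with the canceled one using exact fractions. The function names original and simplified are mine, and the test points near 3 are arbitrary.

```python
from fractions import Fraction


def original(x):
    return (x**3 - 27) / (x**4 - 5*x**3 + 6*x**2)


def simplified(x):
    # What is left after canceling the common factor (x - 3).
    return (x**2 + 3*x + 9) / (x**2 * (x - 2))


# The two expressions agree exactly at points near (but not equal to) 3.
for k in (1, 4, 8):
    x = 3 + Fraction(1, 10**k)
    print(float(x), original(x) == simplified(x), float(simplified(x)))

# Plugging x = 3 into the simplified expression gives the limit.
print(simplified(Fraction(3)))
```

The printed values settle down toward 3 as the test points move toward x = 3, which matches the plugging-in result.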


What if the denominator is 0 but the numerator isn’t 0? In that case, 
there’s always a vertical asymptote involved; that is, the graph of the rational 
function will have a vertical asymptote at the value of x that you’re interested 
in. The problem is that there are four types of behavior that could arise. In 
each of the following diagrams, / is the rational function we care about, and 
the various limits at x = a are shown under the picture: 




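[Figure: four graphs of y = f(x), each with a vertical asymptote at x = a. The limits shown under the four pictures are as follows.] 

1. lim_{x→a⁺} f(x) = ∞, lim_{x→a⁻} f(x) = −∞, and lim_{x→a} f(x) DNE. 

2. lim_{x→a⁺} f(x) = ∞, lim_{x→a⁻} f(x) = ∞, and lim_{x→a} f(x) = ∞. 

3. lim_{x→a⁺} f(x) = −∞, lim_{x→a⁻} f(x) = ∞, and lim_{x→a} f(x) DNE. 

4. lim_{x→a⁺} f(x) = −∞, lim_{x→a⁻} f(x) = −∞, and lim_{x→a} f(x) = −∞. 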


So, how do you tell which of the four cases you’re dealing with? You just 
have to explore the sign of f(x) on either side of x = a. If it’s positive on 
both sides, for example, then you must be in the second case above. Here’s 
an actual example: how would you find 

$$\lim_{x \to 1} \frac{2x^2 - x - 6}{x (x - 1)^3}?$$
First, plugging in x = 1 gives −5/0 (try it!). So we must be dealing with one of 
the four cases above. Which one? Let’s set f(x) = (2x² − x − 6)/(x(x − 1)³) 
and see what happens when we move x around near 1. The first thing to 
notice is that the numerator 2x² − x − 6 is actually equal to −5 when x = 1, 
so when we wobble x around a little bit, the numerator will stay negative. 
How about the factor of x in the denominator? When x = 1, this factor is of 
course 1, which is positive, and it stays positive when you move x around a 
bit. The crucial factor is (x − 1)³. This is positive when x > 1 but negative 
when x < 1. So we can summarize the situation like this (using (+) and (−) 
to denote positive and negative quantities, respectively, and of course using 
the fact that (−)·(−) = (+) and so on): 

$$\text{when } x > 1: \ \frac{(-)}{(+) \cdot (+)} = (-); \qquad \text{when } x < 1: \ \frac{(-)}{(+) \cdot (-)} = (+).$$

That is, f(x) is negative when x is a little greater than 1, but positive when x 
is a little less than 1. Look up at the four pictures above: the only one that 
works is the third figure. In particular, we can see that the two-sided limit 

$$\lim_{x \to 1} \frac{2x^2 - x - 6}{x (x - 1)^3}$$

does not exist, but the one-sided limits do (although they are infinite); in 
particular, 

$$\lim_{x \to 1^+} \frac{2x^2 - x - 6}{x (x - 1)^3} = -\infty \qquad \text{and} \qquad \lim_{x \to 1^-} \frac{2x^2 - x - 6}{x (x - 1)^3} = \infty.$$

Now suppose we change the limit slightly to 

$$\lim_{x \to 1} \frac{2x^2 - x - 6}{x (x - 1)^2}.$$

How does that change anything? Well, the numerator is still negative when x 
is near 1, and the factor x is still positive, but how about (x − 1)²? Since it’s 
a square, it must be positive when x is near but not equal to 1. So we now 
have the following situation: 

$$\text{when } x > 1: \ \frac{(-)}{(+) \cdot (+)} = (-); \qquad \text{when } x < 1: \ \frac{(-)}{(+) \cdot (+)} = (-).$$

Now we have negative values on either side of x = 1, so we must have 

$$\lim_{x \to 1} \frac{2x^2 - x - 6}{x (x - 1)^2} = -\infty.$$

Of course, the left- and right-hand limits are both equal to −∞ as well. 
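The sign analysis is easy to spot-check by evaluating the function just to either side of x = 1. This is a sketch using exact fractions; the names f_cubed and f_squared and the offset 1/1000 are my own choices.

```python
from fractions import Fraction


def f_cubed(x):
    return (2*x**2 - x - 6) / (x * (x - 1)**3)


def f_squared(x):
    return (2*x**2 - x - 6) / (x * (x - 1)**2)


offset = Fraction(1, 1000)

# With (x - 1)^3: negative just right of 1, positive just left of 1.
print(float(f_cubed(1 + offset)), float(f_cubed(1 - offset)))

# With (x - 1)^2: negative on both sides of 1.
print(float(f_squared(1 + offset)), float(f_squared(1 - offset)))
```

The values are also enormous in size, as you’d expect this close to a vertical asymptote; the signs tell you which of the four pictures applies.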

4.2 Limits Involving Square Roots as x → a 


Consider the following limit: 

$$\lim_{x \to 5} \frac{\sqrt{x^2 - 9} - 4}{x - 5}.$$

If you plug in x = 5, you get the indeterminate form 0/0 (try it and see!). 
Trying to factor everything in sight doesn’t work so well: you can write x² − 9 
as (x − 3)(x + 3), but that doesn’t really help because of that blasted −4 in the 
numerator. What you need to do is multiply and divide by √(x² − 9) + 4; this 
is called the conjugate expression of √(x² − 9) − 4. (You have probably already 
met conjugate expressions in your math studies, especially when rationalizing 
the denominator. The basic idea is that the conjugate expression of a − b is 
a + b, and vice versa.) So, here’s what we get when we do this multiplication 
and division: 

$$\lim_{x \to 5} \frac{\sqrt{x^2 - 9} - 4}{x - 5} = \lim_{x \to 5} \frac{\sqrt{x^2 - 9} - 4}{x - 5} \times \frac{\sqrt{x^2 - 9} + 4}{\sqrt{x^2 - 9} + 4}.$$

This looks more complicated, but something nice happens: using the formula 
(a − b)(a + b) = a² − b², the numerator simplifies to (√(x² − 9))² − 4², or simply 
x² − 25. So the above limit is just 

$$\lim_{x \to 5} \frac{x^2 - 25}{(x - 5)(\sqrt{x^2 - 9} + 4)}.$$

Factor x² − 25 as (x − 5)(x + 5) and cancel to see that this limit becomes 

$$\lim_{x \to 5} \frac{(x - 5)(x + 5)}{(x - 5)(\sqrt{x^2 - 9} + 4)} = \lim_{x \to 5} \frac{x + 5}{\sqrt{x^2 - 9} + 4}.$$

Now if you substitute x = 5, there are no problems: you simply get 10/8, or 
5/4. The moral of the story is that if you have a square root plus or minus 
another quantity, try multiplying and dividing by its conjugate; you might 
be pleasantly surprised! 
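Here is a floating-point sketch comparing the original expression with the one you get after multiplying by the conjugate and canceling. The names original and conjugate_form are mine, and the offsets from 5 are arbitrary.

```python
import math


def original(x):
    # 0/0 at x = 5, so this can only be evaluated near 5.
    return (math.sqrt(x**2 - 9) - 4) / (x - 5)


def conjugate_form(x):
    # After multiplying by the conjugate and canceling (x - 5).
    return (x + 5) / (math.sqrt(x**2 - 9) + 4)


for offset in (0.1, 1e-3, 1e-6):
    x = 5 + offset
    print(x, original(x), conjugate_form(x))

# The conjugate form has no problem at x = 5 itself.
print(conjugate_form(5))
```

Both columns head toward 5/4 = 1.25, but only the conjugate form can actually be evaluated at x = 5.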


4.3 Limits Involving Rational Functions as x → ∞ 


OK, back to rational functions, but this time we’ll look at what happens as 
x → ∞ instead of some finite value. In symbols, we are now trying to find 
limits of the form 

$$\lim_{x \to \infty} \frac{p(x)}{q(x)},$$

where p and q are polynomials. Now, here’s a very important property of a 
polynomial: when x is large, the leading term dominates. What this 
means is that if you have a polynomial p, then as x gets larger and larger, 
p(x) behaves as if only its leading term were present. For example, let’s say 
p(x) = 3x³ − 1000x² + 5x − 7. Let’s put p_L(x) = 3x³, which is the leading 
term of p. 







Does this make sense? Why is it the leading term, anyway? Why isn’t it one 
of the other terms? If you want, you can skip to the next paragraph if you just want 
the mathematical proof; first, however, I’d like to get a feel for what happens 
in our example, p(x) = 3x³ − 1000x² + 5x − 7, by testing it on actual 
values of x. Let’s start off with x = 100. In that case, 3x³ is 3 million, while 
1000x² is 10 million. The quantity 5x is only 500, and the 7 doesn’t make 
much difference, so all together we can see that p(100) is about −7 million. On 
the other hand, p_L(100) is 3 million, so it’s not looking so great: p(100) and 
p_L(100) are completely different. Let’s not lose heart; after all, 100 isn’t that 
large. Suppose we instead set x equal to 1,000,000 (that’s a million). Then 
3x³ is freakin’ huge: it’s 3,000,000,000,000,000,000, or three million trillion. 
In comparison, 1000x² is relatively puny at only a thousand trillion (that is, 
1,000,000,000,000,000) and 5x is only 5 million, which is a microscopic speck 
of dust in comparison to these numbers. The −7 term is just laughable and 
makes no noticeable difference. So, to calculate p(1,000,000), we need to take 
3 million trillion and take away a thousand trillion plus some spare change (a 
little under 5 million). Let’s face it, it’s still darned close to 3 million trillion. 
After all, how many trillions are we dealing with here? We have 3 million of 
them, and we’re taking away a mere one thousand of them, so we still have 
almost 3 million trillions. That is, p(1,000,000) is about 3 million trillion, and 
that is exactly the value of p_L(1,000,000). The point is that the highest-degree 
term is growing much faster than the other terms as x gets large. Indeed, if 
you replace 1,000,000 with an even larger number, the difference between x³ 
and the lower order terms like x² and x becomes even more pronounced. 
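Enough philosophical rambling. Let’s try to give a real proof that 

$$\lim_{x \to \infty} \frac{p(x)}{p_L(x)} = 1.$$

We have to do some actual math. Start off by writing 

$$\lim_{x \to \infty} \frac{p(x)}{p_L(x)} = \lim_{x \to \infty} \frac{3x^3 - 1000x^2 + 5x - 7}{3x^3},$$

which simplifies to 

$$\lim_{x \to \infty} \left(1 - \frac{1000}{3x} + \frac{5}{3x^2} - \frac{7}{3x^3}\right).$$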
How do you handle this? Well, the first thing to note is that you can bust up 
this last expression into four separate limits. So if you know what happens 
to the four quantities 1, −1000/(3x), 5/(3x²), and −7/(3x³) as x becomes very 
large, then you can just add the four limits together to get the limit you 
want. Technically, this could be described in words as “the limit of the sum 
is equal to the sum of the limits”; this is true when all the limits are finite.* 
So, we have four quantities to worry about. The first is 1, which is always 1 
regardless of what happens to x. The second quantity is −1000/(3x). What 
happens to this when x gets large? That is, what is 

$$\lim_{x \to \infty} -\frac{1000}{3x}?$$

The trick here is to realize that you can take out a factor of −1000/3. In 
particular, the limit can be expressed as 

$$\lim_{x \to \infty} -\frac{1000}{3} \times \frac{1}{x}.$$

The cool thing about something like −1000/3 is that it’s constant. It doesn’t 
change, no matter what x is, so it turns out that you can just go ahead and 
drag it out of the limit (see Section A.2.2 of Appendix A for more details). 
So we have 

$$\lim_{x \to \infty} -\frac{1000}{3} \times \frac{1}{x} = -\frac{1000}{3} \lim_{x \to \infty} \frac{1}{x}.$$

We’ve already seen that the reciprocal of a very large number is a very small 
number (remember, this means a number very close to zero). So lim_{x→∞} 1/x = 0, 
and −1000/3 times the limit is also 0. The conclusion is that 

$$\lim_{x \to \infty} -\frac{1000}{3x} = 0.$$

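In fact, you should just write that down without going into any more detail. 
More generally, you can use the following theorem: 

$$\boxed{\lim_{x \to \infty} \frac{C}{x^n} = 0}$$

for any n > 0, as long as C is constant. This fact allows us to see that the 
other two pieces, 5/(3x²) and −7/(3x³), also tend to 0 as x becomes very large. 
So the whole argument is 

$$\lim_{x \to \infty} \frac{p(x)}{p_L(x)} = \lim_{x \to \infty} \frac{3x^3 - 1000x^2 + 5x - 7}{3x^3} = \lim_{x \to \infty} \left(1 - \frac{1000}{3x} + \frac{5}{3x^2} - \frac{7}{3x^3}\right) = 1 - 0 + 0 - 0 = 1.$$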


*It’s not true if the limits aren’t finite! Consider lim_{x→∞} (x + (1 − x)). For any x, it’s true 
that (x + (1 − x)) = 1, so this limit is just 1. On the other hand, the individual limits of 
the two pieces (x) and (1 − x) are lim_{x→∞} (x) and lim_{x→∞} (1 − x). The first limit is ∞ and the 
second is −∞, but it’s not true that ∞ + (−∞) = 1. In fact, the expression ∞ + (−∞) is 
meaningless. 

So we have proved that 

$$\lim_{x \to \infty} \frac{p(x)}{\text{leading term of } p(x)} = 1$$

in the special case where p(x) = 3x³ − 1000x² + 5x − 7. Luckily the same 
method works for any polynomial, and we’ll be using it over and over again 
during the rest of this chapter! 
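To watch the leading term take over, here is a sketch that computes the exact ratio p(x)/p_L(x) for a few values of x. Python integers are exact, so the huge numbers are no problem; the name p_leading stands for p_L, and the three sample values of x are my own choice.

```python
from fractions import Fraction


def p(x):
    return 3*x**3 - 1000*x**2 + 5*x - 7


def p_leading(x):
    return 3*x**3


for x in (100, 10**6, 10**9):
    ratio = Fraction(p(x), p_leading(x))
    print(x, p(x), float(ratio))
```

At x = 100 the ratio is nowhere near 1 (it’s negative, in fact), but by x = 1,000,000 it is already within a whisker of 1, and it only gets closer from there.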


4.3.1 Method and examples 




Here’s the general idea: when you see p(x) for some polynomial p with more 
than one term, replace it by 

$$\frac{p(x)}{\text{leading term of } p(x)} \times (\text{leading term of } p(x)).$$

Do this for every polynomial around! Note that all we’ve done is to divide 
and multiply by the leading term, so we haven’t changed the quantity p(x). 
The point is that the fraction in the expression above has limit 1 as x → ∞, 
and the leading term is much simpler. Let’s see how this works in practice: 
for example, what is 


$$\lim_{x \to \infty} \frac{x - 8x^4}{7x^4 + 5x^3 + 2000x^2 - 6}?$$

We have two polynomials: one on the top and one on the bottom. For the 
numerator, the leading term is −8x⁴ (don’t be fooled by the order in which 
the numerator is written: the leading term isn’t always written first!). So 
we’re going to replace the numerator by 

$$\frac{x - 8x^4}{-8x^4} \times (-8x^4).$$

Similarly, the denominator has leading term 7x⁴, so we’ll replace it by 

$$\frac{7x^4 + 5x^3 + 2000x^2 - 6}{7x^4} \times (7x^4).$$

Making both these replacements leads to this: 

$$\lim_{x \to \infty} \frac{x - 8x^4}{7x^4 + 5x^3 + 2000x^2 - 6} = \lim_{x \to \infty} \frac{\dfrac{x - 8x^4}{-8x^4} \times (-8x^4)}{\dfrac{7x^4 + 5x^3 + 2000x^2 - 6}{7x^4} \times (7x^4)}.$$

Looking at this, you should concentrate on the ratio 

$$\frac{-8x^4}{7x^4},$$

because that’s what’s really going on here. The other fractions all have limit 
1, but we have effectively squeezed all the important juice out of our two 
polynomials into the simple ratio of leading terms. Luckily that ratio just 
simplifies to −8/7, so that should be our answer. To nail that down, we have 
to prove that the other fractions have limit 1, but that’s no problem. You 
see, in each of the little fractions, we can do the division and we see that our 
above limit can be written as 

$$\lim_{x \to \infty} \frac{-\dfrac{1}{8x^3} + 1}{1 + \dfrac{5}{7x} + \dfrac{2000}{7x^2} - \dfrac{6}{7x^4}} \times \frac{-8x^4}{7x^4}.$$

Now we take limits; from the fact in the box in the previous section, any 
expression of the form C/x^n goes to 0 as x → ∞ (provided that C is constant 
and n > 0). So most of the stuff goes away! We also cancel out the x⁴ factor 
on the right to see that we are reduced to 

$$\frac{0 + 1}{1 + 0 + 0 - 0} \times \frac{-8}{7} = -\frac{8}{7},$$

and we’re all done. 

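Here’s another example: find 

$$\lim_{x \to \infty} \frac{(x^4 + 3x - 99)(2 - x^5)}{(18x^7 + 9x^6 - 3x^2 - 1)(x + 1)}.$$

We have four polynomials here, with leading terms x⁴, −x⁵, 18x⁷, and x. So 
we’ll use our method for each one of them! Try it and see for yourself before 
reading further. Even if you don’t, make sure you understand every step of 
the argument below: 

$$\begin{aligned}
&\lim_{x \to \infty} \frac{(x^4 + 3x - 99)(2 - x^5)}{(18x^7 + 9x^6 - 3x^2 - 1)(x + 1)} \\
&\quad = \lim_{x \to \infty} \frac{\left(\dfrac{x^4 + 3x - 99}{x^4} \times (x^4)\right)\left(\dfrac{2 - x^5}{-x^5} \times (-x^5)\right)}{\left(\dfrac{18x^7 + 9x^6 - 3x^2 - 1}{18x^7} \times (18x^7)\right)\left(\dfrac{x + 1}{x} \times (x)\right)} \\
&\quad = \lim_{x \to \infty} \frac{\left(1 + \dfrac{3}{x^3} - \dfrac{99}{x^4}\right)\left(-\dfrac{2}{x^5} + 1\right)}{\left(1 + \dfrac{9}{18x} - \dfrac{3}{18x^5} - \dfrac{1}{18x^7}\right)\left(1 + \dfrac{1}{x}\right)} \times \frac{(x^4)(-x^5)}{(18x^7)(x)} \\
&\quad = \frac{(1 + 0 - 0)(0 + 1)}{(1 + 0 - 0 - 0)(1 + 0)} \times \lim_{x \to \infty} \frac{-x}{18} = -\infty.
\end{aligned}$$

The main point is that we boiled out the leading terms into the ratio 

$$\frac{(x^4)(-x^5)}{(18x^7)(x)},$$

which simplifies to −x/18. Everything else had no effect! Finally, when 
x → ∞, the quantity −x/18 goes to −∞, so that’s the “value” of the limit 
we’re looking for. 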
In the previous two examples, we’ve seen that the limit might be finite and 
nonzero (we got the answer −8/7) or it might be infinite (we got the answer 
−∞). Let’s look at the degree of the polynomials in these examples. In the 
first example, both the numerator and the denominator were of degree 4. In 
the second example, the numerator is the product of polynomials of degree 
4 and degree 5, so if you multiply it out, you get a polynomial of degree 
9. Similarly, the denominator is the product of polynomials of degree 7 and 
degree 1, so it has total degree 8. In this case, the numerator is of greater 
degree than the denominator. On the other hand, consider this limit: 

$$\lim_{x \to \infty} \frac{2x + 3}{x^2 - 7}.$$

Let’s use our methods to solve it: 

$$\lim_{x \to \infty} \frac{2x + 3}{x^2 - 7} = \lim_{x \to \infty} \frac{\dfrac{2x + 3}{2x} \times (2x)}{\dfrac{x^2 - 7}{x^2} \times (x^2)} = \lim_{x \to \infty} \frac{1 + \dfrac{3}{2x}}{1 - \dfrac{7}{x^2}} \times \frac{2x}{x^2} = \frac{1 + 0}{1 - 0} \times \lim_{x \to \infty} \frac{2}{x} = 0.$$


Here, the denominator has degree 2, which is greater than the numerator’s 
degree (which is 1). The result is that the denominator dominates, so the 
limit is 0. In general, here’s what we can say considering the limit 

$$\lim_{x \to \infty} \frac{p(x)}{q(x)},$$

where p and q are polynomials: 

1. If the degree of p equals the degree of q, the limit is finite and nonzero. 

2. If the degree of p is greater than the degree of q, the limit is ∞ or −∞. 

3. If the degree of p is less than the degree of q, the limit is 0. 

(All this is also true when x → −∞, so that the limit is 

$$\lim_{x \to -\infty} \frac{p(x)}{q(x)};$$

we’ll consider this case in Section 4.5 below.) These facts are easily proved in 
general using the above methods. Useful as these facts are, you really don’t 
need them to solve problems; you should use the dividing and multiplying 
method, then use the facts to check that your answer makes sense. 
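The three facts can be turned into a tiny answer-checker. The sketch below is only for limits as x → ∞, and the representation is my own assumption: each polynomial is a list of integer coefficients starting with the constant term, so [3, 2] means 2x + 3. The helper names degree_and_lead and limit_at_infinity are made up for this illustration.

```python
from fractions import Fraction


def degree_and_lead(coeffs):
    """Return (degree, leading coefficient) of a polynomial given as a list
    of coefficients from the constant term upward."""
    for power in range(len(coeffs) - 1, -1, -1):
        if coeffs[power] != 0:
            return power, coeffs[power]
    raise ValueError("the zero polynomial has no leading term")


def limit_at_infinity(p, q):
    """Limit of p(x)/q(x) as x -> infinity, using only the leading terms."""
    deg_p, lead_p = degree_and_lead(p)
    deg_q, lead_q = degree_and_lead(q)
    if deg_p == deg_q:
        return Fraction(lead_p, lead_q)      # finite and nonzero
    if deg_p < deg_q:
        return Fraction(0)                   # denominator dominates
    # Numerator dominates: the sign comes from the ratio of leading coefficients.
    return float("inf") if lead_p * lead_q > 0 else float("-inf")


# (x - 8x^4) / (7x^4 + 5x^3 + 2000x^2 - 6)
print(limit_at_infinity([0, 1, 0, 0, -8], [-6, 0, 2000, 5, 7]))
# (2x + 3) / (x^2 - 7)
print(limit_at_infinity([3, 2], [-7, 0, 1]))
```

This reproduces −8/7 and 0 for the two examples above. As the text says, it’s best used to check an answer you’ve already worked out with the dividing and multiplying method, not as a replacement for it.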


4.4 Limits Involving Poly-type Functions as x → ∞ 

Consider functions f, g and h defined by 

f(x) = x³ + 4x² − 5x^(2/3) + 1, g(x) = √(x⁹ − 7x² + 2), 
and h(x) = x 4 — y x s \/x 2 -2x-\-3. 
These aren’t polynomials because they involve fractional powers or nth roots, 
but they look a little like polynomials. In fact, the methods of the previous 
section work on these objects as well, so I’ll call them “poly-type functions.” 

The principles for poly-type functions are similar to those for polynomials, 
except that this time it may not be so clear what the leading term is. The 
presence of square roots (or cube roots, fourth roots, and so on) can have a 
big impact on this. For example, let’s consider 

$$\lim_{x \to \infty} \frac{\sqrt{16x^4 + 8} + 3x}{2x^2 + 6x + 1}.$$

The bottom is a polynomial with leading term 2x², so we’ll replace it by 

$$\frac{2x^2 + 6x + 1}{2x^2} \times (2x^2).$$

How about the top? The part under the square root is a polynomial, 16x⁴ + 8, 
and its leading term is 16x⁴. If you take the square root of that, you get 4x². 
So mentally you should think of the top as behaving like 4x² + 3x. The leading 
term of that is 4x², so that’s what we’re going to use. Specifically, we will 
replace the top by 

$$\frac{\sqrt{16x^4 + 8} + 3x}{4x^2} \times (4x^2).$$

How do you simplify the first fraction? The answer is that you can drag the 
4x² under the square root, and it becomes 16x⁴: 

$$\frac{\sqrt{16x^4 + 8} + 3x}{4x^2} = \frac{\sqrt{16x^4 + 8}}{4x^2} + \frac{3x}{4x^2} = \sqrt{\frac{16x^4 + 8}{16x^4}} + \frac{3x}{4x^2}.$$

Now if you split up more and cancel, you can reduce this to 

$$\sqrt{1 + \frac{8}{16x^4}} + \frac{3}{4x}.$$

As x → ∞, the parts with x on the bottom just go away, so this expression 
goes to 

$$\sqrt{1 + 0} + 0 = 1.$$

So, let’s put it all together and write out the solution to the original problem: 

$$\begin{aligned}
\lim_{x \to \infty} \frac{\sqrt{16x^4 + 8} + 3x}{2x^2 + 6x + 1}
&= \lim_{x \to \infty} \frac{\dfrac{\sqrt{16x^4 + 8} + 3x}{4x^2} \times (4x^2)}{\dfrac{2x^2 + 6x + 1}{2x^2} \times (2x^2)} \\
&= \lim_{x \to \infty} \frac{\sqrt{\dfrac{16x^4 + 8}{16x^4}} + \dfrac{3x}{4x^2}}{\dfrac{2x^2 + 6x + 1}{2x^2}} \times \frac{4x^2}{2x^2} \\
&= \lim_{x \to \infty} \frac{\sqrt{1 + \dfrac{8}{16x^4}} + \dfrac{3}{4x}}{1 + \dfrac{6}{2x} + \dfrac{1}{2x^2}} \times 2 \\
&= \frac{\sqrt{1 + 0} + 0}{1 + 0 + 0} \times 2 = 2.
\end{aligned}$$

Nice, huh? Messy, but nice. Now let’s see what happens when we modify the 
situation very slightly. Consider 

$$\lim_{x \to \infty} \frac{\sqrt{16x^4 + 8} + 3x^3}{2x^2 + 6x + 1}.$$

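The only change is that the 3x term in the numerator in the previous example 
has become 3x³. How does this affect things? Well, we already said that the 
√(16x⁴ + 8) term behaves like 4x² for large x, but this time it gets swamped 
by the higher-degree term 3x³. So now we have to replace the top by 

$$\frac{\sqrt{16x^4 + 8} + 3x^3}{3x^3} \times (3x^3).$$

Of course, when we drag 3x³ under the square root, it will become 9x⁶. All 
together, then, the solution looks like this: 

$$\begin{aligned}
\lim_{x \to \infty} \frac{\sqrt{16x^4 + 8} + 3x^3}{2x^2 + 6x + 1}
&= \lim_{x \to \infty} \frac{\dfrac{\sqrt{16x^4 + 8} + 3x^3}{3x^3} \times (3x^3)}{\dfrac{2x^2 + 6x + 1}{2x^2} \times (2x^2)} \\
&= \lim_{x \to \infty} \frac{\sqrt{\dfrac{16x^4 + 8}{9x^6}} + \dfrac{3x^3}{3x^3}}{1 + \dfrac{6}{2x} + \dfrac{1}{2x^2}} \times \frac{3x^3}{2x^2} \\
&= \lim_{x \to \infty} \frac{\sqrt{\dfrac{16}{9x^2} + \dfrac{8}{9x^6}} + 1}{1 + \dfrac{6}{2x} + \dfrac{1}{2x^2}} \times \frac{3x}{2} \\
&= \frac{\sqrt{0 + 0} + 1}{1 + 0 + 0} \times \lim_{x \to \infty} \frac{3x}{2} = \infty.
\end{aligned}$$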
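Both poly-type answers can be spot-checked in floating point. In this sketch the names first and second are mine; the second function is divided by 3x/2 so that you can see it really does behave like 3x/2 for large x, and the sample values of x are arbitrary.

```python
import math


def first(x):
    # Numerator behaves like 4x^2, so the ratio should approach 4/2 = 2.
    return (math.sqrt(16*x**4 + 8) + 3*x) / (2*x**2 + 6*x + 1)


def second(x):
    # Numerator behaves like 3x^3, so the ratio should grow like 3x/2.
    return (math.sqrt(16*x**4 + 8) + 3*x**3) / (2*x**2 + 6*x + 1)


for x in (1e2, 1e4, 1e6):
    print(x, first(x), second(x), second(x) / (3*x/2))
```

The second column settles near 2, the third column blows up, and the last column settles near 1, which is what it means for the second function to behave like 3x/2.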

Make sure you understand each step of the last two solutions. In the first 
example, the leading term came from the 16x 4 under the square root; even 
when you take the square root, the resulting term 4x 2 still dominated the 
rest of the numerator (3a:). In the second example, the rest of the numerator 
(3a; 3 ) was the dominant force. But wait, you say — what if they are the same? 
For example, what is 

v V4x Q -bx 5 -2 x 3 0 
v27x^ + 8^ 

The denominator isn’t too nasty, actually, but let’s just look at the numerator 
for a second. Under the square root, we have 4x 6 — 5a: 5 , which behaves like 
its leading term 4x 6 when x is large. So we should think that V4x 6 — 5x 5 
behaves like V^x 6 , which is just 2a: 3 (since x is positive). The problem is 
that we are taking away 2a: 3 in the numerator, so it looks like we’re left with 
nothing! Crap. What do we do? 

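The solution is to use the same technique as described in Section 4.2 above: 
multiply top and bottom by the conjugate expression of the numerator. So 
before we even look at leading terms, we need to do some prep work: 

$$\lim_{x\to\infty} \frac{\sqrt{4x^6-5x^5}-2x^3}{\sqrt[3]{27x^6+8x}}
= \lim_{x\to\infty} \frac{\sqrt{4x^6-5x^5}-2x^3}{\sqrt[3]{27x^6+8x}} \times \frac{\sqrt{4x^6-5x^5}+2x^3}{\sqrt{4x^6-5x^5}+2x^3}.$$

Now the formula $(a-b)(a+b) = a^2-b^2$ allows us to simplify this whole thing 
to 

$$\lim_{x\to\infty} \frac{(4x^6-5x^5)-(2x^3)^2}{\sqrt[3]{27x^6+8x}\left(\sqrt{4x^6-5x^5}+2x^3\right)}
= \lim_{x\to\infty} \frac{-5x^5}{\sqrt[3]{27x^6+8x}\left(\sqrt{4x^6-5x^5}+2x^3\right)}.$$

Pull out the quantities $-5x^5$, $3x^2$, and $4x^3$ to get 

$$\lim_{x\to\infty} \frac{1}{\left(\dfrac{\sqrt[3]{27x^6+8x}}{3x^2}\right)\left(\dfrac{\sqrt{4x^6-5x^5}+2x^3}{4x^3}\right)} \times \frac{-5x^5}{(3x^2)(4x^3)}.$$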



Now all you have to do is cancel $x^5$ from the top and the bottom and use the 
arguments from above to show that the final answer is $-5/12$. I’ve left you 
with a bit of work, but you should try to assemble all the bits and pieces from 
above into a complete solution. 
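
If you’d like some reassurance that $-5/12$ is believable before writing up the 
full argument, here is a minimal Python sketch that just plugs in bigger and 
bigger values of $x$. It is a numerical sanity check, not a proof, and the 
function names are made up for this illustration. The second function uses the 
conjugate form, which avoids subtracting two nearly equal numbers. 

```python
import math

def original_form(x):
    """(sqrt(4x^6 - 5x^5) - 2x^3) / cuberoot(27x^6 + 8x), for x >= 1.25."""
    top = math.sqrt(4 * x**6 - 5 * x**5) - 2 * x**3
    bottom = (27 * x**6 + 8 * x) ** (1 / 3)
    return top / bottom

def conjugate_form(x):
    """The same quantity after multiplying top and bottom by the conjugate."""
    top = -5 * x**5
    bottom = (27 * x**6 + 8 * x) ** (1 / 3) * (
        math.sqrt(4 * x**6 - 5 * x**5) + 2 * x**3
    )
    return top / bottom

# Both columns should drift toward -5/12 as x grows.
for x in (10.0, 100.0, 1000.0, 10000.0):
    print(x, original_form(x), conjugate_form(x))

print("target:", -5 / 12)
```

The two forms agree, and both creep toward $-5/12$; the algebra above is what 
explains why. 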


4.5 Limits Involving Rational Functions as $x \to -\infty$ 


Now let’s spend a little time on limits of the form 

$$\lim_{x\to-\infty} \frac{p(x)}{q(x)},$$

where $p$ and $q$ are polynomials or even poly-type functions. All the principles 
we’ve been using apply equally well here. When $x$ is a very large negative 
number, the highest-degree term in any sum still dominates. Also, it’s true 
that $C/x^n$ still goes to 0 as $x \to -\infty$, provided that $C$ is constant and $n$ 
is a positive integer. (Can you see why?) This all means that the solutions 
are almost identical to what we’ve already seen. For example, consider some 
adaptations of two examples we’ve already looked at in Section 4.3.1 above: 

$$\lim_{x\to-\infty} \frac{x-8x^4}{7x^4+5x^3+2000x^2-6} \qquad\text{and}\qquad \lim_{x\to-\infty} \frac{(x^4+3x-99)(2-x^5)}{(18x^7+9x^6-3x^2-1)(x+1)}.$$

All I’ve done is change $\infty$ to $-\infty$, so that we are now interested in what 
becomes of the two rational functions when $x$ is a very large negative number. 
The solution to the first one is the same as it was when $x$ tended to $\infty$; you 
just multiply and divide by the leading term of each polynomial: 


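$$\lim_{x\to-\infty} \frac{x-8x^4}{7x^4+5x^3+2000x^2-6}
= \lim_{x\to-\infty} \frac{\dfrac{x-8x^4}{-8x^4} \times (-8x^4)}{\dfrac{7x^4+5x^3+2000x^2-6}{7x^4} \times (7x^4)}$$

$$= \lim_{x\to-\infty} \frac{-\dfrac{1}{8x^3}+1}{1+\dfrac{5}{7x}+\dfrac{2000}{7x^2}-\dfrac{6}{7x^4}} \times \frac{-8x^4}{7x^4}
= \frac{-0+1}{1+0+0-0} \times \left(-\frac{8}{7}\right) = -\frac{8}{7}.$$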


The point here is that any term that looks like $C/x^n$ for some positive $n$ goes 
to 0 as $x \to -\infty$, just the same as it does when $x \to \infty$. On the other hand, 
the second example is not quite identical; the very last step is different from 
the previous version of the problem: 

$$\lim_{x\to-\infty} \frac{(x^4+3x-99)(2-x^5)}{(18x^7+9x^6-3x^2-1)(x+1)}$$

$$= \lim_{x\to-\infty} \frac{\left(\dfrac{x^4+3x-99}{x^4} \times (x^4)\right)\left(\dfrac{2-x^5}{-x^5} \times (-x^5)\right)}{\left(\dfrac{18x^7+9x^6-3x^2-1}{18x^7} \times (18x^7)\right)\left(\dfrac{x+1}{x} \times (x)\right)}$$

$$= \lim_{x\to-\infty} \frac{\left(1+\dfrac{3}{x^3}-\dfrac{99}{x^4}\right)\left(-\dfrac{2}{x^5}+1\right)}{\left(1+\dfrac{9}{18x}-\dfrac{3}{18x^5}-\dfrac{1}{18x^7}\right)\left(1+\dfrac{1}{x}\right)} \times \frac{(x^4)(-x^5)}{(18x^7)(x)}$$

$$= \frac{(1+0-0)(-0+1)}{(1+0-0-0)(1+0)} \times \lim_{x\to-\infty} \frac{-x}{18} = \infty.$$

Only when we take the limit at the very end do we see anything different from 
when $x \to \infty$: as $x \to -\infty$, now $-x/18$ goes to $\infty$ rather than $-\infty$. 
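
Since only the leading terms matter in the end, the whole procedure for a 
rational function can be boiled down to a few lines. The following Python 
sketch is one way to do it; the helper names `poly_mul` and 
`limit_at_infinity` are invented here, and the code assumes each polynomial is 
given as a list of coefficients, highest degree first, with a nonzero leading 
coefficient. 

```python
from fractions import Fraction

def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists, highest degree first."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def limit_at_infinity(num, den, negative=False):
    """Limit of num(x)/den(x) as x -> infinity (or -infinity if negative=True)."""
    a, n = num[0], len(num) - 1   # leading coefficient and degree on top
    b, m = den[0], len(den) - 1   # leading coefficient and degree on the bottom
    if n < m:
        return Fraction(0)
    if n == m:
        return Fraction(a, b)
    # Otherwise the ratio behaves like (a/b) * x^(n-m), which blows up.
    sign = 1 if a * b > 0 else -1
    if negative and (n - m) % 2 == 1:
        sign = -sign              # an odd power of a negative x is negative
    return sign * float("inf")

first_num = [-8, 0, 0, 1, 0]                                    # x - 8x^4
first_den = [7, 5, 2000, 0, -6]                                 # 7x^4 + 5x^3 + 2000x^2 - 6
second_num = poly_mul([1, 0, 0, 3, -99], [-1, 0, 0, 0, 0, 2])   # (x^4 + 3x - 99)(2 - x^5)
second_den = poly_mul([18, 9, 0, 0, 0, -3, 0, -1], [1, 1])      # (18x^7 + ... - 1)(x + 1)

print(limit_at_infinity(first_num, first_den, negative=True))    # -8/7
print(limit_at_infinity(second_num, second_den))                 # -inf
print(limit_at_infinity(second_num, second_den, negative=True))  # inf
```

The sign flip in the last two lines is exactly the difference noted above 
between $x \to \infty$ and $x \to -\infty$. 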

There’s only one other thing you have to beware. We’ve been dragging 
factors into square roots without being too careful. To show you what I 
mean, try simplifying $\sqrt{x^2}$. Did you get $x$? That’s not right if $x$ is 
negative, unfortunately. For example, if you square $-2$ and then take the 
square root, you will get 2. So in fact $\sqrt{x^2} = -x$ when $x$ is negative. This 
sort of thing comes up when you look at poly-type limits as $x \to -\infty$, for 
example: 

$$\lim_{x\to-\infty} \frac{\sqrt{4x^6+8}}{2x^3+6x+1}.$$

The denominator behaves like its leading term $2x^3$, but how about the 
numerator? The term in the square root, $4x^6+8$, behaves like $4x^6$, so 
$\sqrt{4x^6+8}$ behaves like $\sqrt{4x^6}$. Tempting as it is to simplify this as 
$2x^3$, it is simply not correct! Since $x \to -\infty$, we are interested in what 
happens when $x$ is negative. This means that $2x^3$ is negative, but 
$\sqrt{4x^6}$ is positive, so we must simplify $\sqrt{4x^6}$ as $-2x^3$. Here’s 
how the solution goes: 


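$$\lim_{x\to-\infty} \frac{\sqrt{4x^6+8}}{2x^3+6x+1}
= \lim_{x\to-\infty} \frac{\dfrac{\sqrt{4x^6+8}}{\sqrt{4x^6}} \times \sqrt{4x^6}}{\dfrac{2x^3+6x+1}{2x^3} \times (2x^3)}$$

$$= \lim_{x\to-\infty} \frac{\sqrt{1+\dfrac{8}{4x^6}}}{1+\dfrac{6x}{2x^3}+\dfrac{1}{2x^3}} \times \frac{-2x^3}{2x^3}
= \frac{\sqrt{1+0}}{1+0+0} \times (-1) = -1.$$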


You have to exercise similar care when you deal with fourth roots, sixth roots, 
and so on. For example, 

$$\sqrt[4]{x^4} = -x \quad\text{if $x$ is negative.}$$

The same would be true if you replaced every instance of 4 with any even 
number. On the other hand, it’s not true if you replace 4 by an odd number; 
for example, 

$$\sqrt[3]{x^3} = x \quad\text{for all $x$ (positive, negative, or zero).}$$

One other point, though: it’s still true that 

$$\sqrt{x^4} = x^2$$

even if $x < 0$! Why? Because $x^2$ can’t be negative, and $\sqrt{x^4}$ can’t be 
negative by definition, so there can’t possibly be a minus sign! Here’s a 
summary of the situation: 


if $x < 0$ and you want to write $\sqrt[n]{x^{\text{something}}} = x^m$, the only time you 
need a minus sign in front of $x^m$ is when $n$ is even and $m$ is odd. 
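
It’s easy to convince yourself of the sign issue by trying actual numbers. 
Here is a small Python sketch (the function names are just for illustration) 
that computes $\sqrt{x^2}$ literally and then evaluates the example above at a 
few large negative values of $x$. Again, this is a sanity check rather than a 
proof. 

```python
import math

def sqrt_of_square(x):
    """sqrt(x^2) computed literally; this is |x|, which is -x when x < 0."""
    return math.sqrt(x * x)

for x in (3.0, -2.0):
    print(x, sqrt_of_square(x), abs(x))

def ratio(x):
    """sqrt(4x^6 + 8) / (2x^3 + 6x + 1), the example with x -> -infinity."""
    return math.sqrt(4 * x**6 + 8) / (2 * x**3 + 6 * x + 1)

# The top is always positive and the bottom is negative for these x,
# so the values should approach -1, not +1.
for x in (-10.0, -100.0, -1000.0):
    print(x, ratio(x))
```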


4.6 Limits Involving Absolute Values 


Sometimes you have to deal with functions involving absolute values. Consider 
this limit: 

$$\lim_{x\to 0^-} \frac{|x|}{x}.$$

In order to answer this, let’s set $f(x) = |x|/x$ and check it out some more. 
First, note that 0 can’t be in the domain of $f$, since the denominator would 
then be 0. On the other hand, everything else is fine. Let’s look at what 
happens when $x$ is positive. The quantity $|x|$ is then just $x$, so we see that 
$f(x) = 1$ if $x$ is any positive number. On the other hand, if $x$ is negative, 
then $|x| = -x$, so $f(x) = -x/x = -1$ if $x < 0$. That is, writing $f(x) = |x|/x$ 
is just a fancy way of saying that $f(x) = 1$ if $x > 0$ and $f(x) = -1$ if $x < 0$. 
The graph of $y = f(x)$ looks like this: 


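[Figure: graph of $y = |x|/x$, at height 1 for $x > 0$ and at height $-1$ for $x < 0$.] 

So, for the left-hand limit that we were looking at, you approach $x = 0$ 
from the left, and it’s clear that 

$$\lim_{x\to 0^-} \frac{|x|}{x} = -1.$$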
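
To see the two constant pieces of this function numerically, here is a tiny 
Python sketch. The sample points are an arbitrary choice made for this 
illustration. 

```python
def f(x):
    """f(x) = |x|/x, which is undefined at x = 0."""
    return abs(x) / x

# Sample points creeping toward 0 from the left and from the right.
left = [f(-(10.0 ** -k)) for k in range(1, 6)]
right = [f(10.0 ** -k) for k in range(1, 6)]
print(left)    # every entry is -1.0
print(right)   # every entry is 1.0
```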














CHAPTER 5 


Continuity and Differentiability 


In general, there’s only one special thing about the graph of a function: it just 
has to obey the vertical line test. That’s not particularly exclusive. The graph 
could be all over the place — a little bit here, a vertical asymptote there, or 
any number of individual disconnected points wherever the hell they feel like 
being. So now we’re going to see what happens if we’re a little more exclusive: 
we want to look at two types of smoothness. First, continuity: intuitively, this 
means that the graph now has to be drawn in one piece, without taking the 
pen off the page. Second, differentiability: the intuition here is that there are 
no sharp corners in the graph. In both cases, we’ll do a lot better job with 
the definition, and we’ll see some of the things you can expect to get from 
functions with these special properties. In detail, this is what we’ll look at in 
this chapter: 

• continuity at a point, and over an interval; 

• some examples of continuous functions; 

• the Intermediate Value Theorem for continuous functions; 

• maxima and minima of continuous functions; 

• displacement, average velocity, and instantaneous velocity; 

• tangent lines and derivatives; 

• second and higher-order derivatives; and 

• the relationship between continuity and differentiability. 

5.1 Continuity 


We’ll start off by looking at what it means for a function to be continuous. 
As I said above, the intuition is that you can draw the graph of the function 
in one piece, without lifting your pen off the page. This is all very well for 
something like $y = x^2$, which is all in one piece; but it’s a little unfair for 
something like $y = 1/x$. This would have had a graph in one piece except 
for the vertical asymptote at $x = 0$, which breaks it into two. In fact, if 
$f(x) = 1/x$, then we want to say that $f$ is continuous everywhere except at 
$x = 0$. So we have to understand what it means to be continuous at a point, 
and then we’ll worry about continuity over larger regions like intervals. 


5.1.1 Continuity at a point 

Let’s start with a function $f$ and a point $a$ on the $x$-axis which is in the domain 
of $f$. When we draw the graph of $y = f(x)$, we don’t want to lift up the pen 
as we pass through the point $(a, f(a))$ on the graph. It doesn’t matter if we 
have to lift up our pen elsewhere, as long as we don’t lift it up near $(a, f(a))$. 
This means that we want a stream of points $(x, f(x))$ which get closer and 
closer (arbitrarily close, in fact) to the point $(a, f(a))$. In other words, as 
$x \to a$, we need $f(x) \to f(a)$. Yes, ladies and gentlemen, we’re dealing with 
limits here. We can now give a proper definition: 

A function $f$ is continuous at $x = a$ if $\lim_{x\to a} f(x) = f(a)$. 


Of course, for this last equation to make sense at all, both sides must be 
defined. If the limit doesn’t exist, then $f$ isn’t continuous at $x = a$, whereas 
if $f(a)$ doesn’t exist, then you’re totally screwed: there isn’t even a point 
$(a, f(a))$ to go through! So we can be a little more precise about the definition 
and explicitly require three things to be true: 

1. The two-sided limit $\lim_{x\to a} f(x)$ exists (and is finite). 

2. The function is defined at $x = a$; that is, $f(a)$ exists (and is finite). 

3. The two above quantities are equal: that is, 

$$\lim_{x\to a} f(x) = f(a).$$

Let’s see what happens if any of these properties fail. Consider the following 
graphs: 



In diagram #1, the left- and right-hand limits aren’t the same at $x = a$, so 
the two-sided limit doesn’t exist there; therefore the function isn’t continuous 
at $x = a$. In diagram #2, the left- and right-hand limits exist and are finite 
and equal to each other, so the two-sided limit exists; however the function 
isn’t even defined at $x = a$, so it isn’t continuous there. In diagram #3, the 
two-sided limit again exists, and the function is defined at $x = a$, but the limit 
isn’t the same as the function value; once again, the function isn’t continuous 
at $x = a$. On the other hand, the function in diagram #4 is indeed continuous 
at $x = a$, since the two-sided limit at $x = a$ exists, $f(a)$ exists, and the limit is 
the same as the value of the function. By the way, we say that the functions 
in the first three diagrams have a discontinuity at $x = a$. 


5.1.2 Continuity on an interval 

We now know what it means for a function to be continuous at a single point. 
Let’s extend this definition and say that a function $f$ is continuous on the 
interval $(a, b)$ if it is continuous at every point in the interval. Notice that 
$f$ doesn’t actually have to be continuous at the endpoints $x = a$ or $x = b$. 
For example, if $f(x) = 1/x$, then $f$ is continuous on the interval $(0, \infty)$ even 
though $f(0)$ isn’t defined. This function is also continuous on $(-\infty, 0)$, but 
not on $(-2, 3)$, since 0 lies within that interval, and $f$ isn’t continuous there. 

How about an interval like $[a, b]$? We have to be a little more flexible. For 
example, below is the graph of a function with domain $[a, b]$; we’d like to say 
that it’s continuous on $[a, b]$: 



The problem is that the two-sided limits at the endpoints $x = a$ and $x = b$ 
don’t exist: we only have a right-hand limit at $x = a$ and a left-hand limit at 
$x = b$. That’s OK; we just modify our definition a bit by using the appropriate 
one-sided limits at the endpoints. So we say that a function $f$ is continuous 
on $[a, b]$ if 

1. the function $f$ is continuous at every point in $(a, b)$; 

2. the function $f$ is right-continuous at $x = a$. That is, $\lim_{x\to a^+} f(x)$ exists 
(and is finite), $f(a)$ exists, and these two quantities are equal; and 

3. the function $f$ is left-continuous at $x = b$. That is, $\lim_{x\to b^-} f(x)$ exists (and 
is finite), $f(b)$ exists, and these two quantities are equal. 

Finally, we just say that a function is continuous if it is continuous at all 
the points in its domain, with the understanding that if its domain includes 
an interval with a left and/or right endpoint, then we only need one-sided 
continuity there. 

5.1.3 Examples of continuous functions 

Many common functions are continuous. For example, every polynomial is 
continuous. This seems a little hard to prove, since there are so many different 
polynomials, but actually it’s not so bad. First, let’s prove that the constant 
function $f$, defined by $f(x) = 1$ for all $x$, is continuous at any point $a$. Well, 
we need to show that 

$$\lim_{x\to a} f(x) = f(a).$$

Since $f(x) = 1$ for any $x$, and $f(a) = 1$, this means that we need to show 
that 

$$\lim_{x\to a} 1 = 1.$$

Of course, this is obviously true, since nothing depends on $x$ or $a$! Now, let’s 
set $g(x) = x$. Is $g$ continuous? Well, now we need 

$$\lim_{x\to a} g(x) = g(a).$$

Since $g(x) = x$ and $g(a) = a$, this reduces to showing that 

$$\lim_{x\to a} x = a.$$

This is also obviously true: as $x \to a$, well, $x \to a$! Now we just need to observe 
that a constant multiple of a continuous function is continuous; also, if you 
add, subtract, multiply or take the composition of two continuous functions, 
you get another continuous function (see Section A.4.1 of Appendix A for 
more info). The same is almost true if you divide one continuous function 
by another: the quotient function is continuous everywhere except where the 
denominator is 0. For example, 1/x is continuous except at x = 0, since we’ve 
seen that both the numerator and denominator are continuous functions of x. 

Anyway, back to polynomials. Because $g(x) = x$ is continuous in $x$, we 
can multiply $g$ by itself to see that $x^2$ is also continuous in $x$. You can keep 
multiplying by $x$ as often as you like to prove the continuity of any power of 
$x$ (as a function of $x$). Then you can multiply by constant coefficients and 
add different powers together to get any polynomial, and everything’s still 
continuous! 

It turns out that all exponentials and logarithms are continuous, as are all 
the trig functions (except where they have vertical asymptotes). We’ll just 
take that for granted for the moment and return to this point in Section 5.2.11 
below. Meanwhile, I want to look at a more exotic function. Consider the 
function $f$ defined by $f(x) = x\sin(1/x)$. We looked at the graph of this (at 
least when $x > 0$) in Section 3.6 of Chapter 3. In fact, it’s really easy to extend 
the graph to $x < 0$, because $f$ is an even function. Why? Remembering that 
$\sin(x)$ is an odd function of $x$, we have 

$$f(-x) = (-x)\sin\left(\frac{1}{-x}\right) = (-x)\left(-\sin\left(\frac{1}{x}\right)\right) = x\sin\left(\frac{1}{x}\right) = f(x).$$

So $f$ is indeed even, and we can get the graph of all of $f$ by reflecting the 
previous graph using the $y$-axis as our mirror (the graph only shows the domain 
$-0.3 < x < 0.3$): 

Now let’s consider the continuity of the function. As a function of $x$, we 
know that $1/x$ is continuous away from $x = 0$; now compose this with the 
sine function, which is also continuous, and you can see that $\sin(1/x)$ is also 
continuous away from $x = 0$. Now you just have to multiply $\sin(1/x)$ by $x$ 
(which is obviously a continuous function of $x$!) to see that $f$ is continuous 
everywhere except at $x = 0$. 

Now, what happens at $x = 0$? Clearly $f$ is not continuous at $x = 0$, since 
it’s not even defined there (there’s a hole in the graph). Let’s plug up this 
hole by defining a function $g$ as follows: 

$$g(x) = \begin{cases} x\sin\left(\dfrac{1}{x}\right) & \text{if } x \neq 0, \\ 0 & \text{if } x = 0. \end{cases}$$

So $g(x) = f(x)$ everywhere except at $x = 0$, where $g$ equals 0 but $f$ is 
undefined. As a result, $g$ is automatically continuous everywhere $f$ is (namely, 
everywhere except $x = 0$), but now we need to see what happens at $x = 0$. 
We have a hope because $g(0)$ is defined. Also, we used the sandwich principle 
in Section 3.6 of Chapter 3 to show that 

$$\lim_{x\to 0^+} g(x) = \lim_{x\to 0^+} x\sin\left(\frac{1}{x}\right) = 0.$$

By symmetry (or the sandwich principle, again), we can see that the left-hand 
limit is also equal to 0. So in fact the two-sided limit is 0 as well: 

$$\lim_{x\to 0} g(x) = 0.$$

So we have shown that 

$$\lim_{x\to 0} g(x) = g(0),$$

since both sides exist and are equal to 0. This means that $g$ is actually 
continuous at $x = 0$, even though it was cobbled together in piecewise fashion. 
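
Here is a short Python sketch of the patched-up function, in case you want to 
watch the sandwich principle at work. The name `g` matches the text, but the 
sample points are chosen arbitrarily for illustration; the printout only 
suggests the limit, it doesn’t prove it. 

```python
import math

def g(x):
    """x*sin(1/x) away from 0, with the hole at x = 0 plugged by g(0) = 0."""
    return x * math.sin(1 / x) if x != 0 else 0.0

for k in range(1, 7):
    x = 10.0 ** -k
    # The sandwich principle says -|x| <= g(x) <= |x|, so both values
    # printed here are squeezed toward g(0) = 0 as x shrinks.
    print(x, g(x), g(-x))
```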

We’re almost ready to look at two nice facts involving continuity; first I 
want to return to a point I made at the beginning of Chapter 4. The first 
example we looked at was 

$$\lim_{x\to-1} \frac{x^2-3x+2}{x-2},$$

which we solved by just substituting $x = -1$ to get the answer $-2$. Why is 
this justified? The argument seems to contradict the idea that the value of 
the above limit has nothing to do with what happens at $x = -1$, only what 
happens near $x = -1$. This is where continuity comes in: it connects the 
“near” with the “at.” Specifically, if we let $f(x) = (x^2-3x+2)/(x-2)$, 
then since the numerator and denominator are polynomials, $f$ is continuous 
everywhere except where the denominator is 0. That is, $f$ is continuous 
everywhere except at $x = 2$. So $f$ is continuous at $x = -1$, which means that 

$$\lim_{x\to-1} f(x) = f(-1).$$

Replacing $f$ by its definition, we have 

$$\lim_{x\to-1} \frac{x^2-3x+2}{x-2} = \frac{(-1)^2-3(-1)+2}{(-1)-2} = -2.$$


That is the complete solution. In practice, few mathematicians would bother 
spelling it out in such gory detail, but it’s worth understanding what you’re 
doing whenever possible! 


5.1.4 The Intermediate Value Theorem 

Knowing that a function is continuous brings some benefits. We’re going to 
look at two such benefits. The first is called the Intermediate Value Theorem, 
or IVT for short. Here’s the idea: let’s suppose that a function $f$ is continuous 
on a closed interval $[a, b]$. Also suppose that $f(a) < 0$ and $f(b) > 0$. So in the 
graph of $y = f(x)$, we know that the point $(a, f(a))$ lies below the $x$-axis and 
that the point $(b, f(b))$ lies above the $x$-axis, like this: 


Now, if you have to connect those two points with a curve (which of course 
has to obey the vertical line test), and you’re not allowed to lift your pen up, 
it’s intuitively obvious that your pen will have to cross the $x$-axis somewhere 
between $a$ and $b$, at least once. It could be close to $a$ or close to $b$, or somewhere 
in the middle; you might cross back and forth many times; but the critical 
thing is that you have to cross at least once. That is, there is an $x$-intercept 
somewhere between $a$ and $b$. It’s crucial that the function $f$ is continuous at 
every point in $[a, b]$; look what can happen if $f$ is discontinuous at even one 
point: 

[Figure: graph of a function on $[a, b]$ with a jump discontinuity.] 

The discontinuity allows this function to jump over the $x$-axis without passing 
through it. So, we need continuity on the whole region $[a, b]$. All this is also 
true if we start above the axis and end below it; that is, if $f(a) > 0$ and 
$f(b) < 0$, we must have an $x$-intercept somewhere in $[a, b]$ if $f$ is continuous 
on all of $[a, b]$. Since an $x$-intercept at $c$ means that $f(c) = 0$, we can state 
the Intermediate Value Theorem as follows: 

Intermediate Value Theorem: if $f$ is continuous on $[a, b]$, and $f(a) < 0$ 
and $f(b) > 0$, then there is at least one number $c$ in the interval $(a, b)$ 
such that $f(c) = 0$. The same is true if instead $f(a) > 0$ and $f(b) < 0$. 


There’s a proof of this theorem in Section A.4.2 of Appendix A. For now, 
let’s look at a few examples of how to apply this theorem. First, suppose 
you want to show that the polynomial $p(x) = -x^5+x^4+3x+1$ has an 
$x$-intercept between $x = 1$ and $x = 2$. All you have to do is notice that 
$p$ is continuous everywhere (including $[1, 2]$) because it’s a polynomial; also, 
calculate $p(1) = 4 > 0$ and $p(2) = -9 < 0$. Since $p(1)$ and $p(2)$ have opposite 
signs and $p$ is continuous on $[1, 2]$, we know that there is at least one number 
$c$ in the interval $(1, 2)$ such that $p(c) = 0$. This number $c$ is an $x$-intercept of 
the polynomial $p$. 
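
The arithmetic in that example is easy to double-check. A minimal Python 
sketch of the sign check, with the polynomial typed in directly: 

```python
def p(x):
    """p(x) = -x^5 + x^4 + 3x + 1."""
    return -x**5 + x**4 + 3 * x + 1

print(p(1), p(2))          # 4 -9
print(p(1) * p(2) < 0)     # True: opposite signs, so the IVT applies on [1, 2]
```

The code only confirms the sign change; it’s the continuity of $p$ that lets 
the theorem turn that sign change into a root. 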

Here’s a slightly harder example. How would you show that the equation 
$x = \cos(x)$ has a solution? You don’t have to find the solution, only to 
show that there is one. You could start by drawing the graphs of $y = x$ and 
$y = \cos(x)$ on the same axes. If you do, you’ll find that the intersection of the 
graphs has $x$-coordinate somewhere around $\pi/4$. This graphical argument, 
while compelling, doesn’t cut it so far as a mathematical proof is concerned. 
How can we do better? 

The first step is to use a little trick: put everything onto the left-hand 
side. So, instead of solving $x = \cos(x)$, we try to solve $x - \cos(x) = 0$. Now 
we must take the initiative and set $f(x) = x - \cos(x)$. We’ll be all done if we 
can show that there is a number $c$ such that $f(c) = 0$. Let’s check that this 
makes sense: if $f(c) = 0$, then $c - \cos(c) = 0$, so $c = \cos(c)$ and we have found 
a solution to the equation $x = \cos(x)$, namely $x = c$. 

Now it’s time to use the Intermediate Value Theorem. We need to find 
two numbers $a$ and $b$ such that one of $f(a)$ and $f(b)$ is negative and the other 
one is positive. Since we think (from the graph) that the answer is around 
$\pi/4$, we’ll be conservative and take $a = 0$ and $b = \pi/2$. Let’s check the values 
of $f(0)$ and $f(\pi/2)$. First, $f(0) = 0 - \cos(0) = 0 - 1 = -1$, which is negative, 
and second, $f(\pi/2) = \pi/2 - \cos(\pi/2) = \pi/2 - 0 = \pi/2$, which is positive. 
Since $f$ is continuous (it is the difference of two continuous functions), we 
can conclude by the Intermediate Value Theorem that $f(c) = 0$ for some $c$ 
in the interval $(0, \pi/2)$, and we have shown that $x = \cos(x)$ has a solution. 
We don’t know where the solution is, nor how many solutions there are; we only 
know that there is at least one solution in the interval $(0, \pi/2)$. (Note that the 
solution is not really at $\pi/4$! It’s not possible to find a nice expression for the 
answer, actually.) 
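
The theorem itself only promises that a solution exists, but applying it over 
and over to smaller and smaller intervals is a standard way to approximate 
one (this is usually called the bisection method, which the text above doesn’t 
cover). Here is a minimal Python sketch under that assumption; the function 
name `bisect` and the tolerance are choices made for this illustration. 

```python
import math

def bisect(f, a, b, tol=1e-10):
    """Shrink [a, b] around a sign change of a continuous function f."""
    fa, fb = f(a), f(b)
    if fa * fb >= 0:
        raise ValueError("f(a) and f(b) must have opposite signs")
    while b - a > tol:
        mid = (a + b) / 2
        fm = f(mid)
        if fm == 0:
            return mid
        # Keep whichever half still has a sign change across it.
        if fa * fm < 0:
            b, fb = mid, fm
        else:
            a, fa = mid, fm
    return (a + b) / 2

root = bisect(lambda x: x - math.cos(x), 0.0, math.pi / 2)
print(root)          # roughly 0.739
print(math.pi / 4)   # roughly 0.785, so the solution is not at pi/4
```

At every step the IVT guarantees that the half we keep still contains a 
solution, which is why the method can’t lose it. 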

Here’s a small variation. So far, we have required that $f(a) < 0$ and 
$f(b) > 0$ (or the other way around), then concluded that there’s a number 
$c$ in $(a, b)$ such that $f(c) = 0$. Instead, we can replace 0 by any number $M$ 
and the result is still true. So, suppose $f$ is continuous on $[a, b]$; if $f(a) < M$ 
and $f(b) > M$ (or the other way around), then there is some $c$ in $(a, b)$ such 
that $f(c) = M$. For example, if $f(x) = 3^x + x^2$, then does the equation 
$f(x) = 5$ have a solution? Certainly $f$ is continuous; also we can guess to 
plug in 0 and 2, which leads to $f(0) = 1$ and $f(2) = 13$. Since the numbers 1 
and 13 surround the target number 5 (one is smaller and the other is bigger), 
the Intermediate Value Theorem tells us that $f(c) = 5$ for some $c$ in $(0, 2)$. 
That is, $f(x) = 5$ does have a solution. Now, try to repeat the problem by 
starting with a new function $g$, where $g(x) = 3^x + x^2 - 5$. Convince yourself 
that if $f(x) = 5$ has a solution $c$, then this number $c$ is also a solution of the 
equation $g(x) = 0$. Since $g(0) < 0$ and $g(2) > 0$, you can use the previous 
method instead of the variation! In fact, the variation doesn’t really give us 
anything new; it just makes life a little easier sometimes. 
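
If you want to see the two versions side by side, here is a small Python 
sketch that checks the numbers from this example. The names `f`, `g`, and `M` 
follow the text. 

```python
def f(x):
    """f(x) = 3^x + x^2."""
    return 3**x + x**2

def g(x):
    """g(x) = f(x) - 5, so g(x) = 0 exactly when f(x) = 5."""
    return f(x) - 5

M = 5
print(f(0), f(2))          # 1 13
print(f(0) < M < f(2))     # True: 1 and 13 surround the target 5
print(g(0), g(2))          # -4 8
print(g(0) < 0 < g(2))     # True: the original form of the theorem applies to g
```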


5.1.5 A harder IVT example 


One last example: let’s show that any polynomial of odd degree has at least 
one root. That is, let $p$ be a polynomial of odd degree; I claim that there is 
at least one number $c$ such that $p(c) = 0$. (This isn’t true for polynomials of 
even degree: for example, the quadratic $x^2+1$ doesn’t have any roots, since its 
graph doesn’t cross the $x$-axis.) So, how do we prove my claim? 

The key is actually found in the methods of Section 4.3 of the previous 
chapter. There we saw that if $p(x)$ is any polynomial and $a_n x^n$ is its leading 
term, then 

$$\lim_{x\to\infty} \frac{p(x)}{a_n x^n} = 1 \qquad\text{and}\qquad \lim_{x\to-\infty} \frac{p(x)}{a_n x^n} = 1.$$

So when $x$ gets very large, $p(x)$ and $a_n x^n$ are relatively close to each other 
(their ratio is near 1). This means that they at least have the same sign as 
each other! One can’t be negative and the other positive, or else their ratio 
would be negative, not close to 1. The same is true when $x$ is a very large 
negative number. 

So let’s suppose that $A$ is a large negative number, so large that $p(A)$ and 
$a_n A^n$ have the same sign. Also, we’ll pick some huge positive number $B$ so 
that $p(B)$ and $a_n B^n$ have the same sign. Now let’s compare the signs of $a_n A^n$ 
and $a_n B^n$. Since $n$ is an odd number, these must have opposite signs! One 
is negative and one is positive. For example, if $a_n > 0$, then $a_n B^n$ is positive 
and $a_n A^n$ is negative. (This is only true because $n$ is odd: if $n$ were even then 
both quantities would be positive.) So here’s the situation: 

$$p(A) \;\overset{\text{same sign as}}{\longleftrightarrow}\; a_n A^n \;\overset{\text{opposite sign to}}{\longleftrightarrow}\; a_n B^n \;\overset{\text{same sign as}}{\longleftrightarrow}\; p(B).$$

So $p(A)$ and $p(B)$ have opposite signs. Since $p$ is a polynomial, it is continuous; 
by the Intermediate Value Theorem, there is a number $c$ between $A$ and $B$ 
such that $p(c) = 0$. That is, $p$ has a root, although we really have no idea 
where it is. That makes sense since we knew virtually nothing about $p$ to 
start with, only that its degree was odd. 

5.1.6 Maxima and minima of continuous functions 

Let’s move on to the second benefit of knowing that a function is continuous. 
Suppose we have a function $f$ which we know is continuous on the closed 
interval $[a, b]$. (It’s very important that the interval is closed at both ends.) 
That means that we put our pen down at the point $(a, f(a))$ and draw a curve 
that ends up at $(b, f(b))$ without taking our pen off the paper. The question 
is, how high can we go? In other words, is there any limit to how high up this 
curve could go? The answer is yes: there must be a highest point, although 
the curve could reach that height multiple times. 

In symbols, let’s say that the function $f$ defined on the interval $[a, b]$ has a 
maximum at $x = c$ if $f(c)$ is the highest value of $f$ on the whole interval $[a, b]$. 
That is, $f(c) \ge f(x)$ for all $x$ in the interval. The idea that I’ve been driving 
at is that a continuous function on $[a, b]$ has a maximum in the interval $[a, b]$. 
The same is true for the limbo question, “how low can you go?” We’ll say 
that $f$ has a minimum at $x = c$ if $f(c)$ is the lowest value of $f$ on the whole 
interval; that is, that $f(c) \le f(x)$ for all $x$ in $[a, b]$. Once again, any continuous 
function on the interval $[a, b]$ has a minimum in that interval. These facts form 
a theorem, sometimes known as the Max-Min Theorem, which can be stated 
as follows: 

Max-Min Theorem: if $f$ is continuous on $[a, b]$, then 
$f$ has at least one maximum and one minimum on $[a, b]$. 

Here are some examples of continuous functions on $[a, b]$ and their maxima 
and minima (these are the plurals of maximum and minimum, respectively, 
of course): 

[Figure: four graphs of continuous functions on $[a, b]$; the points $c$ and $d$ are 
marked between $a$ and $b$ on each $x$-axis.] 

In the first graph, the function attains its maximum at $x = c$ and its minimum 
at $x = d$. In the second, the function has a maximum at $x = c$ but the 
minimum is at the left endpoint $x = a$. The third graph has a maximum at 
$x = b$ but the minimum is at both $x = c$ and $x = d$. This is acceptable: 
there are allowed to be multiple minima, as long as there is at least one. 
Finally, the fourth graph shows a constant function, which is continuous; in 
fact, every point in the interval $[a, b]$ is both a maximum and a minimum, 
since the function never goes above or below the constant value $C$. 

So, why does the function $f$ need to be continuous? And why can’t it 
be an open interval, like $(a, b)$? The following diagrams show some potential 
problems: 



In the first figure, the function $f$ has an asymptote in the middle of the interval 
$[a, b]$, which certainly creates a discontinuity. The function has no maximum 
value; it just keeps going up and up on the left side of the asymptote. 
Similarly, it has no minimum value either, since it just plummets way down on 
the right side of the asymptote. 















The middle diagram above involves a more subtle situation. 
Here the function is only continuous on the open interval $(a, b)$. It clearly has 
a minimum at $x = c$, but what is the maximum of this function? You might 
think that it occurs at $x = b$, but think again. The function isn’t even defined 
at $x = b$! So it can’t have a maximum there. If the function has a maximum, 
it must be somewhere near $b$. In fact, you’d like it to be the number less than 
$b$ which is closest to $b$. Unfortunately, there is no such number! Whatever you 
think the closest number is, you can always take the average of this number 
and $b$ to get an even closer number. So there is no maximum; this illustrates 
that the interval of continuity has to be closed in order to guarantee that the 
Max-Min Theorem works. 
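
The averaging trick is worth seeing with actual numbers. In this Python 
sketch, the endpoint $b = 1$ and the first guess 0.9 are arbitrary values 
picked for illustration, and `closer_to` is a made-up helper name. 

```python
def closer_to(b, candidate):
    """Average a candidate with b to get a number strictly between them."""
    return (candidate + b) / 2

b = 1.0
candidate = 0.9    # a first guess at "the number less than b closest to b"
for _ in range(5):
    candidate = closer_to(b, candidate)
    # Each new candidate beats the previous one but is still less than b.
    print(candidate)
```

No matter where you stop, one more averaging step produces a better 
candidate, so there is never a winner. 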

Of course, the conclusion of the theorem could still be true even if the 
interval isn’t closed. For example, the function in the third diagram above 
is only continuous on the open interval $(a, b)$, but it still has a maximum at 
$x = c$ and a minimum at $x = d$. This was just a lucky accident: you can only 
use the theorem to guarantee the existence of a maximum and minimum in 
an interval $[a, b]$ if you know the function is continuous on the entire closed 
interval. 

5.2 Differentiability 

We’ve spent a while looking at continuity. Now it’s time to look at another 
degree of smoothness that a function can have: differentiability. This essen¬ 
tially means that the function has a derivative. So, we’ll spend quite a bit 
of time looking at derivatives. One of the original inspirations for develop¬ 
ing calculus came from trying to understand the relationship between speed, 
distance, and time for moving objects. So let’s start there and work our way 
back to functions later on. 

5.2.1 Average speed 

Imagine looking at a photo of a car on a highway. The exposure time was 
very short, so it’s not blurry; you can’t even tell whether the car was moving 
or not. Now, I ask you this: how fast was the car moving when the picture 
was taken? No problem, you say: just use the classic formula 

$$\text{speed} = \frac{\text{distance}}{\text{time}}.$$

The problem is that the photo conveys no sense of distance (the car hasn’t 
moved) or time (the photo essentially captures an instant of time). So you 
can’t answer my question. 

Ah, but what if I tell you that a minute after the picture was taken, the 
car had traveled one mile? Then you could use the above formula to see that 
the car was going at a mile a minute, or 60 mph. Still, how do you know 
that the car was going the same speed for that whole minute? It might have 
accelerated and decelerated many times during that minute. You have no 
idea how fast it was actually going at the beginning of that minute. In fact, 
the above formula isn’t really accurate: the left-hand side should say average 
speed, since that’s all we’ve found. 















OK, I’ll take pity on you and tell you that the car went 0.25 miles in the 
first 10 seconds. Now you can use the formula and see that the average speed 
over the first 10 seconds is 1.5 miles per minute, or 90 mph. This helps, but 
the car could still have changed its speed over the 10 seconds — we don’t really 
know how fast it was going at the beginning of the period. It’s unlikely that 
it was too far away from 90 mph because the car can only accelerate and 
decelerate so much in such a short time. 
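
The two average speeds so far are quick to recompute. A minimal Python 
sketch, with a made-up function name and the unit conversion written out: 

```python
def average_speed_mph(distance_miles, time_seconds):
    """Average speed in miles per hour: distance divided by time."""
    return distance_miles * 3600 / time_seconds

print(average_speed_mph(1.0, 60))     # one mile in one minute: 60.0
print(average_speed_mph(0.25, 10))    # a quarter mile in ten seconds: 90.0
```

Both numbers are averages over a stretch of time; neither one tells you the 
speed at the instant the photo was taken. 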

It would be even better to know how far the car went in 1 second after the 
photo was taken, but it would still not be perfect. Even 0.0001 seconds might 
be enough for the car’s speed to change, but not by much. If you sensed that 
we’re heading toward whipping out a limit, you’d be quite right. We need to 
look at the concept of velocity first, though. 

5.2.2 Displacement and velocity 

Imagine that the car is driving down a long straight highway. The mile 
markers are a little weird: there’s a 0 marker at some point, and to the left of it, 
the markers start at $-1$ and become more and more negative. To the right 
of the 0 marker, they go as normal. In fact, the whole situation looks exactly 
like a number line: 

[Figure: the highway drawn as a number line of mile markers.] 

Suppose that the car starts at mile 2 and goes directly to mile 5. Then it
has gone a distance of 3 miles. If instead it starts at mile 2 but goes left to
mile −1, it’s also gone a distance of 3 miles. We’d like to distinguish between
these two cases, so we’ll use displacement instead of distance. The formula
for displacement is just

displacement = (final position) − (initial position).

If the car goes from position 2 to 5, then the displacement is 5 − 2 = 3 miles.
If instead the car goes from 2 to −1, the displacement is (−1) − 2 = −3 miles.
So displacement can be negative, unlike distance. In fact, if the displacement
is negative, then the car ends up to the left of where it began.

Another important difference between distance and displacement is that
the displacement only involves the final and initial positions; what the car
does in between is irrelevant. If it went from 2 to 11 and then back to 5, the
distance is 9 + 6 = 15 miles but the total displacement is still only 3 miles. If
instead it went from 2 to −4 and then back to 2, the displacement is actually
0 miles even though the distance is 12 miles. It is true, however, that if the
car just goes in one direction without backtracking, then the distance is the
absolute value of the displacement.
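
The two trips described above can be checked with a short sketch. The function names and the idea of recording a trip as a list of mile markers are illustrative assumptions, not notation from the book.

```python
def displacement(positions):
    """Final position minus initial position (can be negative)."""
    return positions[-1] - positions[0]


def distance(positions):
    """Total ground covered: sum of the absolute lengths of each leg."""
    return sum(abs(b - a) for a, b in zip(positions, positions[1:]))


# Mile markers visited, in order (the trips described in the text).
there_and_partway_back = [2, 11, 5]
out_and_home = [2, -4, 2]
one_way_left = [2, -1]

for trip in (there_and_partway_back, out_and_home, one_way_left):
    print(trip, "displacement:", displacement(trip), "distance:", distance(trip))
```

For the one-way trip the distance equals the absolute value of the displacement, as the text says; for the other two it does not.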

As we saw in the last section, average speed is the distance traveled divided 
by the time taken. If you replace distance by displacement, you get the average 
velocity instead. That is, 



$$\text{average velocity} = \frac{\text{displacement}}{\text{time}}.$$

Again, velocity can be negative while speed must be nonnegative. If the car
has a negative average velocity over a certain time period, then it has ended 
to the left of where it began. If instead the average velocity is 0 over the time 
period, then the car has ended up exactly where it began. Notice that in this 
case the car might have a high average speed even though its average velocity 
is 0! In general, just like displacement, if the car is going in just one direction, 
then the average speed is just the absolute value of the average velocity. 

5.2.3 Instantaneous velocity

We now revisit our crucial question in terms of velocity: how do you measure 
the velocity of the car at a given instant? The idea, as we saw above, is 
to take the average velocity of the car over smaller and smaller time periods 
beginning at the instant the photo was taken. Here’s how it works in symbols. 

Let t be the instant of time we care about. For example, if a race started at 
2 p.m., you might decide to work in seconds with 0 representing the starting 
time; in that case, if the photo was taken at 2:03 p.m. then you’d want to take 
t = 180. Anyway, suppose that u is a short time later than t. Let’s write $v_{t \leftrightarrow u}$
to mean the average velocity of the car during the time interval beginning at
time t and ending at time u. Now we just push u closer and closer to t. How
close? As close as we can! That’s where the limit comes in. In fact,

$$\text{instantaneous velocity at time } t = \lim_{u \to t^+} v_{t \leftrightarrow u}.$$

Why neglect what happens before time t, though? We can make the above
definition a little more general by allowing u to be before t; then we can
replace the right-hand limit by a two-sided limit:

$$\text{instantaneous velocity at time } t = \lim_{u \to t} v_{t \leftrightarrow u}.$$


Now we need a few more formulas. Let’s suppose we know exactly where on 
the highway the car is at any instant of time. In particular, suppose that at 
time t the car is at position f(t). That is, let

f(t) = position of car at time t.

We can now calculate the average velocity $v_{t \leftrightarrow u}$ exactly:

$$v_{t \leftrightarrow u} = \frac{\text{position at time } u - \text{position at time } t}{u - t} = \frac{f(u) - f(t)}{u - t}.$$

Notice that the denominator u − t is the length of time involved (provided
that u is after* t). Anyway, now we just take a limit as u → t:

$$\text{instantaneous velocity at time } t = \lim_{u \to t} \frac{f(u) - f(t)}{u - t}.$$

Of course, you cannot just substitute u = t in the previous limit, because then
you get the indeterminate form 0/0. You really do need to use limits.


*If u is before t, then the denominator should be t − u, but then the numerator should
be f(t) − f(u), so it all works out!

One more little variation. Let’s define h = u − t. Then since u is very
close to t, the difference h between the two times must be very small. Indeed,
as u → t, we can see that h → 0. If we make this substitution in the above
limit, then because u = t + h, we also have

$$\text{instantaneous velocity at time } t = \lim_{h \to 0} \frac{f(t + h) - f(t)}{h}.$$



There’s no real difference between this formula and the previous one; it’s just 
written a little differently. 

Let’s look at a quick example. Suppose that the car starts at rest at the 
7 mile marker, then accelerates to the right beginning at time t = 0 hours. It 
turns out that the car’s position at time t might be something like $15t^2 + 7$
(the number 15 here depends on the acceleration). Without worrying about
why this is true, let’s just let $f(t) = 15t^2 + 7$ and see if we can find the velocity
of the car at any time t.

Using the above formula, we have

$$\text{instantaneous velocity at time } t = \lim_{h \to 0} \frac{f(t + h) - f(t)}{h} = \lim_{h \to 0} \frac{(15(t + h)^2 + 7) - (15t^2 + 7)}{h}.$$

Now expand $(t + h)^2 = t^2 + 2th + h^2$ and simplify a bit to see that the above
expression is

$$\lim_{h \to 0} \frac{15t^2 + 30th + 15h^2 + 7 - 15t^2 - 7}{h} = \lim_{h \to 0} \frac{30th + 15h^2}{h} = \lim_{h \to 0} (30t + 15h).$$

It’s particularly nice that the h gets canceled from the denominator in the
last step, since that’s what was giving us all the trouble. Now we can just put
h = 0 to see that

$$\text{instantaneous velocity at time } t = \lim_{h \to 0} (30t + 15h) = 30t.$$

So at time 0, the car’s velocity is 30 × 0 = 0 mph, which means the car is at rest. Half an
hour later, at time t = 1/2, its velocity is 30 × 1/2 = 15 mph. One hour after
the start time, the velocity is 30 mph. In fact, the fact that the velocity is 30t at
time t tells us that the car gets faster and faster at the constant rate of 30 
mph every hour. That is, the car is constantly accelerating at 30 miles per 
hour per hour, or 30 miles per hour squared. 
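
To see the limit at work numerically, here is a minimal sketch that computes the average velocity of this car over shorter and shorter intervals starting at t = 1/2. The function names are my own; only the position formula comes from the example above.

```python
def position(t):
    """Position (miles) of the car at time t (hours): f(t) = 15 t^2 + 7."""
    return 15 * t**2 + 7


def average_velocity(f, t, h):
    """Average velocity over the interval from t to t + h (h must be nonzero)."""
    return (f(t + h) - f(t)) / h


t = 0.5
for h in (0.1, 0.01, 0.001, 0.0001):
    # Algebraically this quotient equals 30t + 15h, so it should approach 30t = 15.
    print(h, average_velocity(position, t, h))
```

The printed values settle toward 15 mph as h shrinks, in line with the formula 30t. Setting h = 0 in the code would divide by zero, which is the numerical face of the 0/0 problem.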


5.2.4 The graphical interpretation of velocity 

It’s time to look at a graph of the situation. Suppose that f(t) again represents 
the position of the car at time t. If we want the instantaneous velocity at a 
particular time t, we need to pick a time u close to t. Let’s draw the graph
of y = f(t) and mark in the points (t, f(t)) and (u, f(u)) as well as the line
through them:

[Figure: graph of y = f(t) with the line through (t, f(t)) and (u, f(u)).]

The slope of this line is given by

$$\text{slope} = \frac{f(u) - f(t)}{u - t},$$

which is exactly the formula for the average velocity $v_{t \leftrightarrow u}$ from the previous
section. So we have a graphical interpretation for average velocity over the
time period t to u: it’s the slope of the line joining the points (t, f(t)) and
(u, f(u)) on the graph of position versus time.

Let’s try to find a similar interpretation for the instantaneous velocity. We
need to take the limit as u goes to t, so let’s repeat the previous graph a few
times, each time with u closer and closer to the fixed value t:

[Figure: the same graph repeated with u closer and closer to t.]

The lines seem to be getting closer to the tangent line at the point (t, f(t)).
Since the instantaneous velocity is the limit of the slopes of these lines as
u → t, we’d like to say that the instantaneous velocity is exactly equal to the
slope of the tangent line through (t, f(t)). Looks like we need to understand
tangent lines better...

5.2.5 Tangent lines

Suppose we pick a number x in the domain of some function f. Then the
point (x, f(x)) lies on the graph of y = f(x). We want to try to draw a line
through that point which is tangential to the curve; that is, we want to find a
tangent line. Intuitively, this means that the line we’re looking for just grazes
the curve at our point (x, f(x)). The tangent line doesn’t have to intersect
the curve only once! For example, the tangent line through (x, f(x)) in the
following picture hits the curve again, and that’s not a problem:

[Figure: a curve whose tangent line through (x, f(x)) meets the curve a second time.]

It’s possible that there’s no tangent line through a given point on a graph.
For example, consider the graph of y = |x|:



The graph passes through (0,0), but there’s no tangent line through that 
point. What could the tangent line possibly be, after all? No matter what 
you draw, you can’t cuddle up to the graph there since it’s got a sharp point 
at the origin. We’ll return to this example in Section 5.2.10 below. 

Even if the tangent line through (x, f(x)) exists, how on earth do you 
find it? Remember, to specify a line, you only need to provide two pieces 
of information: a point the line goes through and its slope. Then you can 
use the point-slope form to find the equation of the line. Well, we have one 
ingredient: we know the line passes through the point (x, f(x)). Now we just 
need to find the slope. To do this, we’ll play a game similar to the one we 
played with instantaneous velocities in the previous section. 

Start by picking a number z which is close to x (either to the right or to
the left) and plot the point (z, f(z)) on the curve. Now draw the line through
the points (x, f(x)) and (z, f(z)):

[Figure: the curve with a dashed line through (x, f(x)) and (z, f(z)).]

Since the slope is the rise over the run, the slope of the dashed line is

$$\frac{f(z) - f(x)}{z - x}.$$

Now, as the point z gets closer and closer to x, without ever actually getting
to x itself, the slope of the above line should become closer and closer to the
slope of the tangent we’re looking for. So, if there’s any justice in the world,
then it should be true that

$$\text{slope of tangent line through } (x, f(x)) = \lim_{z \to x} \frac{f(z) - f(x)}{z - x}.$$

Let’s set h = z − x; then we see that as z → x, we have h → 0, so we also
have

$$\text{slope of tangent line through } (x, f(x)) = \lim_{h \to 0} \frac{f(x + h) - f(x)}{h}.$$

Of course, this only makes sense if the limit actually exists!
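
The game of sliding z toward x can be tried out numerically. The sketch below is only an illustration: the function `cube` and the point x = 1 are my own choices, not an example from the book, and z is taken on both sides of x.

```python
def secant_slope(f, x, z):
    """Slope of the line through (x, f(x)) and (z, f(z)); requires z != x."""
    return (f(z) - f(x)) / (z - x)


def cube(x):
    """An illustrative curve, y = x^3 (an assumption for this sketch)."""
    return x**3


x = 1.0
for z in (1.5, 1.1, 1.01, 0.99, 0.9, 0.5):
    # For this curve the slope simplifies to z^2 + z + 1, which tends to 3 as z -> 1.
    print(z, secant_slope(cube, x, z))
```

Whether z approaches from the right or from the left, the slopes close in on the same number, which is what it means for the two-sided limit to exist.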

5.2.6 The derivative function 

In the following picture, I’ve drawn in the tangent lines through three different 
points on the curve: 



These lines have different slopes. That is, the slope of the tangent line depends
on which value of x you start with. Another way of saying this is that
the slope of the tangent line through (x, f(x)) is itself a function of x. This
function is called the derivative of f and is written as f'. We say that we have
differentiated the function f with respect to its variable x to get the function
f'. By the formula at the end of the previous section, we see that

$$f'(x) = \lim_{h \to 0} \frac{f(x + h) - f(x)}{h},$$

provided that the limit exists. In this case, we say that f is differentiable at x.
If the limit doesn’t exist for some particular x, then that value of x is not in
the domain of the derivative function f', so we say that f is not differentiable
at x. The limit could fail to exist for a variety of reasons. In particular, there
could be a sharp corner as in the example of y = |x| above. On an even more
basic level, if x isn’t in the domain of f, then you can’t even plot the point
(x, f(x)), let alone draw a tangent line there!

Now let’s recall the definition of instantaneous velocity in Section 5.2.3 
above: 


$$\text{instantaneous velocity at time } t = \lim_{h \to 0} \frac{f(t + h) - f(t)}{h},$$

where f(t) is the position of the car at time t. The right-hand side of this is
the same as the definition of f'(x) above, except with x replaced by t! That
is, if v(t) is the instantaneous velocity at time t, then v(t) = f'(t). Velocity
is precisely the derivative of position with respect to time.

Let’s look at one example of finding a derivative. If $f(x) = x^2$, what
is f'(x)? The computation is very similar to the one we did at the end of
Section 5.2.3 above:

$$f'(x) = \lim_{h \to 0} \frac{f(x + h) - f(x)}{h} = \lim_{h \to 0} \frac{(x + h)^2 - x^2}{h} = \lim_{h \to 0} \frac{2xh + h^2}{h} = \lim_{h \to 0} (2x + h) = 2x.$$

So the derivative of $f(x) = x^2$ is given by f'(x) = 2x. This means that the
slope of the tangent to the parabola $y = x^2$ at the point $(x, x^2)$ is precisely
2x. Let’s draw the curve and a few tangent lines to check it out:

[Figure: the parabola y = x^2 with several tangent lines drawn.]

The slope of the tangent at x = −1 does indeed look like it’s about −2, which
is consistent with the formula f'(x) = 2x. (Twice −1 is −2!) The same
is true with the other tangents: their slopes are all twice the corresponding
x-coordinate.

5.2.7 The derivative as a limiting ratio 

In our formula for the derivative, we have to evaluate the quantity
f(x + h). What is this quantity? Well, if y = f(x), and you change x into
x + h, then f(x + h) is simply the new value of y. The amount h represents
how much you changed x, so let’s replace it by the quantity Δx. Here the
symbol Δ means “change in,” so that Δx is just the change in x. (Don’t think
of Δx as the product of Δ and x; this is just plain wrong!) So, let’s rewrite
the formula for f'(x) with h replaced by Δx:

$$f'(x) = \lim_{\Delta x \to 0} \frac{f(x + \Delta x) - f(x)}{\Delta x}.$$

OK, here’s what happens. We start out with our pair (x, y), where y = f(x).
We now take a new value of x, which we’ll call $x_{\text{new}}$. The value of y then
changes as well to a new value $y_{\text{new}}$, which of course is just $f(x_{\text{new}})$. Now, the
amount of change of any quantity is just the new value minus the old one, so
we have two equations:

$$\Delta x = x_{\text{new}} - x \quad \text{and} \quad \Delta y = y_{\text{new}} - y.$$

The first equation says that $x_{\text{new}} = x + \Delta x$, so now the second equation can
be transformed as follows:

$$\Delta y = y_{\text{new}} - y = f(x_{\text{new}}) - f(x) = f(x + \Delta x) - f(x).$$

But this is just the numerator of the fraction in the definition of f'(x) above!
What this means is that

$$f'(x) = \lim_{\Delta x \to 0} \frac{\Delta y}{\Delta x}.$$

An interpretation of this is that a small change in x produces approximately
f'(x) times as much change in y. Indeed, if $y = f(x) = x^2$, then we’ve seen
in the previous section that f'(x) = 2x. Let’s concentrate on what happens
when x = 6, for example. First, note that our formula for f'(x) shows us that
f'(6) = 2 × 6 = 12. So, if you take the equation $6^2 = 36$ and change the 6 a
little bit, the 36 will change by about 12 times as much. For example, if we
add 0.01 to 6, we should add 0.12 to 36. So I’m saying that $(6.01)^2$ should be
about 36.12. In fact, the actual answer is 36.1201, so I was really close.

Now, why didn’t I get the exact answer? The reason is that f'(x) isn’t
actually equal to the ratio of Δy to Δx: it’s equal to the limit of that ratio
as Δx tends to 0. This means that if we don’t move as far away from 6, we
should do even better. Let’s try to guess the value of $(6.0004)^2$. We have
changed our original value 6 by 0.0004, so the y-value should change by
12 times this much, which is 0.0048. Our guess is therefore that $(6.0004)^2$
is approximately 36.0048. Not bad: the actual answer is 36.00480016, so we
were very close! The smaller the change from 6, the better our method will
work.

Of course, the magic number 12 only works when you start at x = 6.
If instead you start at x = 13, the magic number is f'(13), which equals
2 × 13 = 26. So, we know $13^2 = 169$; what is $(13.0002)^2$? To get from 13
to 13.0002, you have to add 0.0002; since the magic number is 26, we have
to add 26 times as much to 169 to get our guess. That is, we add 0.0052 to
169 and come up with the guess 169.0052. Again, that’s pretty darn good:
$(13.0002)^2$ is actually exactly 169.00520004.
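
The three guesses above can be reproduced with a few lines of code. This is a sketch with function names of my own choosing; it simply packages the rule "new y is about old y plus f'(x) times the change in x" for the function x².

```python
def square(x):
    return x * x


def square_derivative(x):
    """Derivative of x^2, found in Section 5.2.6 to be 2x."""
    return 2 * x


def linear_estimate(f, f_prime, x, dx):
    """Estimate f(x + dx) as f(x) + f'(x) * dx."""
    return f(x) + f_prime(x) * dx


for x, dx in [(6, 0.01), (6, 0.0004), (13, 0.0002)]:
    guess = linear_estimate(square, square_derivative, x, dx)
    actual = square(x + dx)
    # For x^2 the gap between actual and guess is exactly dx^2 (up to rounding).
    print(x + dx, guess, actual, actual - guess)
```

For this particular function the error is exactly (Δx)², which is why shrinking Δx from 0.01 to 0.0004 improves the guess so dramatically.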

Anyway, we’ll return to these ideas in Chapter 13 when we look at linearization.
For now, let’s look at the formula

$$f'(x) = \lim_{\Delta x \to 0} \frac{\Delta y}{\Delta x}$$

once again. The right-hand side is the limit of the ratio of the change in y
to the change in x, as the change in x becomes small. Suppose that the change
in x is so small that it is barely noticeable. Instead of writing Δx, which
means “change in x,” we’d now like to write dx, which should mean “really
really tiny change in x,” and similarly for y. Unfortunately neither dx nor dy
really means anything by itself;* nevertheless this provides the inspiration for
writing the derivative in a different, more convenient way:

if y = f(x), then you can write $\frac{dy}{dx}$ instead of f'(x).

For example, if $y = x^2$, then $\frac{dy}{dx} = 2x$. In fact, if you replace y by $x^2$, you get
a variety of different ways of expressing the same thing:

$$f'(x) = \frac{dy}{dx} = \frac{d(x^2)}{dx} = \frac{d}{dx}(x^2) = 2x.$$

As another example, in Section 5.2.3 above, we saw that if the position of a
car at time t is $f(t) = 15t^2 + 7$, then its velocity is 30t. Remembering that
velocity is just f'(t), this means that f'(t) = 30t. If instead we decided to
call the position p, so that $p = 15t^2 + 7$, we could write $\frac{dp}{dt} = 30t$. The point
is that not everything comes in x’s and y’s; you have to be able to deal with
other letters.

In summary, the quantity $\frac{dy}{dx}$ is the derivative of y with respect to x. If
y = f(x), then $\frac{dy}{dx}$ and f'(x) are the same thing. Finally, remember that the
quantity $\frac{dy}{dx}$ is not actually a fraction at all: it’s the limit of the fraction $\frac{\Delta y}{\Delta x}$
as Δx → 0.


5.2.8 The derivative of linear functions 

Let’s just pause for breath and go back to a simple case: suppose that f is
linear. This means that f(x) = mx + b for some m and b. What do you think
that f'(x) should be? Remember, this measures the slope of the tangent to
the curve y = f(x) at the point (x, f(x)). In our case, the graph of y = mx + b
is just a line of slope m and y-intercept equal to b. If there’s any justice in the
world, then the tangent at any point on the line is just the line itself! This
means that the value of f'(x) should be m no matter what x is: the curve
y = mx + b has constant slope m. Let’s check it out using the formula:

$$f'(x) = \lim_{h \to 0} \frac{f(x + h) - f(x)}{h} = \lim_{h \to 0} \frac{(m(x + h) + b) - (mx + b)}{h} = \lim_{h \to 0} \frac{mh}{h} = \lim_{h \to 0} m = m.$$

So there is justice in the world: f'(x) = m regardless of what x is. That
is, the derivative of a linear function is constant. As you might expect, only
linear functions have constant slope (this is a consequence of the so-called
Mean Value Theorem; see Section 11.3.1 in Chapter 11). By the way, if f is
actually constant, so that f(x) = b, then the slope is always 0. In particular,
f'(x) = 0 for all x. So we’ve proved that the derivative of a constant function
is identically 0.


* There is a theory of “infinitesimals,” but it’s beyond the scope of this book!

5.2.9 Second and higher-order derivatives



Since you can start with a function f and take its derivative to get a new
function f', you can actually take this new function and differentiate it again.
You end up with the derivative of the derivative; this is called the second
derivative, and it’s written as f''.

For example, we’ve seen that if $f(x) = x^2$, then the derivative f'(x) = 2x.
Now we want to differentiate this result. Let’s put g(x) = 2x and try to work
out g'(x). Since g is a linear function with slope 2, we know from the previous
section above that g'(x) = 2. So the derivative of the derivative of f is the
constant function 2, and we have shown that f''(x) = 2 for all x.

If y = f(x), then we’ve seen that we can write $\frac{dy}{dx}$ instead of f'(x). There’s
a similar sort of notation for the second derivative:

if y = f(x), then you can write $\frac{d^2y}{dx^2}$ instead of f''(x).

In the above example, if $y = f(x) = x^2$, then we’ve seen that

$$f''(x) = \frac{d^2y}{dx^2} = \frac{d^2(x^2)}{dx^2} = \frac{d^2}{dx^2}(x^2) = 2.$$

These are all valid ways of expressing that the second derivative of $f(x) = x^2$
(with respect to x) is the constant function 2.

Why stop at taking two derivatives? The third derivative of a function
f is the derivative of the derivative of the derivative of f. That’s a lot of
derivatives! Realistically, you should think of the third derivative of f as
being the derivative of the second derivative of f, and you can write it in any
of the following ways:

$$f'''(x), \quad f^{(3)}(x), \quad \frac{d^3y}{dx^3}, \quad \text{or} \quad \frac{d^3}{dx^3}(y).$$

The notation $f^{(3)}(x)$ is particularly convenient for higher derivatives, because
writing so many apostrophes is just plain stupid. So, for example, the fourth
derivative, which is just the derivative of the third derivative, would be written
$f^{(4)}(x)$ and not f''''(x). That said, it will sometimes be convenient to go the
other way and write $f^{(2)}(x)$ for the second derivative instead of f''(x). It’s
even possible to write $f^{(1)}(x)$ instead of f'(x), since we are only taking one
derivative, and also $f^{(0)}(x)$ instead of just f(x) itself (no derivatives!). That
way, any derivative can be written in the form $f^{(n)}(x)$ for some integer n.
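
The idea of differentiating n times, with n = 0 meaning "leave the function alone," is easy to mimic in code for polynomials. The sketch below is an illustration under two assumptions of mine: a polynomial is stored as a list of coefficients, and each term is differentiated by the power rule that the book only establishes in Chapter 6.

```python
def differentiate(coeffs):
    """Differentiate a polynomial stored as [c0, c1, c2, ...] = c0 + c1 x + c2 x^2 + ...

    Assumes the rule that the derivative of c * x^k is c * k * x^(k - 1).
    """
    derivative = [k * c for k, c in enumerate(coeffs)][1:]
    return derivative or [0]  # the derivative of a constant is the zero polynomial


def nth_derivative(coeffs, n):
    """Apply differentiate n times; n = 0 returns the polynomial itself."""
    for _ in range(n):
        coeffs = differentiate(coeffs)
    return coeffs


x_squared = [0, 0, 1]
for n in range(4):
    print(n, nth_derivative(x_squared, n))
```

Reading the output as polynomials gives x², then 2x, then the constant 2, then 0, matching f, f', f'' from the example, with the third derivative of x² being identically zero.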


5.2.10 When the derivative does not exist

In Section 5.2.5 above, I mentioned that the graph of f(x) = |x| has a sharp
corner at the origin. This should mean that the derivative doesn’t exist at
x = 0. Now let’s try to see why this is. Using the formula for the derivative,
we have

$$f'(x) = \lim_{h \to 0} \frac{f(x + h) - f(x)}{h} = \lim_{h \to 0} \frac{|x + h| - |x|}{h}.$$

We are interested in what happens when x = 0, so let’s replace x by 0 in the
above chain of equations:

$$f'(0) = \lim_{h \to 0} \frac{f(0 + h) - f(0)}{h} = \lim_{h \to 0} \frac{|0 + h| - |0|}{h} = \lim_{h \to 0} \frac{|h|}{h}.$$
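
A quick numerical probe of this last quotient suggests why the limit has no value: for every positive h the quotient is 1, and for every negative h it is −1, so the right-hand and left-hand limits disagree. The function name below is my own; the sketch only evaluates the quotient from the chain of equations above.

```python
def abs_difference_quotient(h):
    """The quotient (|0 + h| - |0|) / h, which simplifies to |h| / h for h != 0."""
    return (abs(0 + h) - abs(0)) / h


for h in (0.1, 0.001, 0.00001):
    # Left column: h approaching 0 from the right; right column: from the left.
    print(h, abs_difference_quotient(h), abs_difference_quotient(-h))
```

No matter how small h gets, the two columns never move toward a common value, so the two-sided limit, and with it the derivative of |x| at 0, does not exist.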

5.2.11 Differentiability and continuity

Now it’s time to relate the two big concepts in this chapter. I’m going to show 
that every differentiable function is also continuous. Another way of looking 
at this is that if you know a function is differentiable, you get the continuity 
of your function for free. More precisely, I will show: 

if a function f is differentiable at x, then it’s continuous at x.

For example, we’ll show in Chapter 7 that sin(x) is differentiable as a function
of x. This will automatically imply that it’s also continuous in x. The same
goes for the other trig functions, exponentials, and logarithms (except at their
vertical asymptotes).

So, how do we prove our big claim? Let’s start by seeing what we want to
prove. To show that f is continuous at x, we’re going to need to show that

$$\lim_{u \to x} f(u) = f(x),$$

remembering from Section 5.1.1 above that this equation can only be true
if both sides actually exist! Before we proceed farther, I want to substitute
h = u − x as we’ve done before. In that case, u = x + h, and as u → x, we
see that h → 0. So the above equation can be replaced by

$$\lim_{h \to 0} f(x + h) = f(x).$$

We need to show that both sides exist and that equality holds; then we’ll be
all done.

Now that we are aware of our destination, let’s start with what we actually
know. Well, we know that f is differentiable at x; this means that f'(x) exists,
so by the definition of f', the limit

$$\lim_{h \to 0} \frac{f(x + h) - f(x)}{h}$$

exists. Let’s first notice that f(x) is involved in this formula, so it must exist
or else the formula is all whacked. So we’ve already gotten somewhere: we
know that f(x) exists. We still need to do something clever. The trick is to
start with another limit:

$$\lim_{h \to 0} \left( \frac{f(x + h) - f(x)}{h} \times h \right).$$

On the one hand, we can work out this limit exactly by splitting it into two
factors:

$$\lim_{h \to 0} \left( \frac{f(x + h) - f(x)}{h} \times h \right) = \left( \lim_{h \to 0} \frac{f(x + h) - f(x)}{h} \right) \times \left( \lim_{h \to 0} h \right) = f'(x) \times 0 = 0.$$

This works just fine because all the limits involved exist. (That’s where you
need the fact that f'(x) exists; otherwise it wouldn’t work.) On the other
hand, we could have taken the original limit and instead canceled out the
factor of h to get

$$\lim_{h \to 0} \left( \frac{f(x + h) - f(x)}{h} \times h \right) = \lim_{h \to 0} (f(x + h) - f(x)).$$

Comparing these two previous equations, we just have

$$\lim_{h \to 0} (f(x + h) - f(x)) = 0.$$

Of course, the value of f(x) doesn’t depend on the limit at all, so we can pull
it out and see that

$$\left( \lim_{h \to 0} f(x + h) \right) - f(x) = 0.$$

Now all we have to do is add f(x) to both sides to get

$$\lim_{h \to 0} f(x + h) = f(x).$$






CHAPTER 6

How to Solve Differentiation Problems 


Now we’ll see how to apply some of the theory from the previous chapter to 
solve problems involving differentiation. Finding derivatives from the formula 
is possible but cumbersome, so we’ll look at a few rules that make life a lot 
easier. All in all, here’s what we’ll tackle in this chapter: 

• finding derivatives using the definition; 

• using the product, quotient, and chain rules; 

• finding equations of tangent lines; 

• velocity and acceleration; 

• finding limits which are derivatives in disguise; 

• how to differentiate piecewise-defined functions; and 

• using the graph of a function to draw the graph of its derivative. 


6.1 Finding Derivatives Using the Definition 



Let’s say we want to differentiate f(x) = 1/x with respect to x. We know
from the previous chapter that the definition of the derivative is

$$f'(x) = \lim_{h \to 0} \frac{f(x + h) - f(x)}{h},$$

so in our case we have

$$f'(x) = \lim_{h \to 0} \frac{\frac{1}{x + h} - \frac{1}{x}}{h}.$$

If you just replace h by 0 in the fraction, you end up with the indeterminate
form 0/0. So you need to work a little. In this case, the idea is to simplify the
numerator by taking a common denominator. You get

$$f'(x) = \lim_{h \to 0} \frac{\frac{x - (x + h)}{x(x + h)}}{h} = \lim_{h \to 0} \frac{-h}{hx(x + h)}.$$

Now cancel out a factor of h from top and bottom, then evaluate the limit by
setting h = 0:

$$f'(x) = \lim_{h \to 0} \frac{-1}{x(x + h)} = \frac{-1}{x(x)} = -\frac{1}{x^2}.$$

That is,

$$\frac{d}{dx}\left(\frac{1}{x}\right) = -\frac{1}{x^2}.$$

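On the other hand, to find the derivative of $f(x) = \sqrt{x}$, you have to employ
the trick that we used in Section 4.2 of Chapter 4. Here’s how it goes:

$$f'(x) = \lim_{h \to 0} \frac{f(x + h) - f(x)}{h} = \lim_{h \to 0} \frac{\sqrt{x + h} - \sqrt{x}}{h},$$

and we are again in 0/0 territory. Let’s multiply top and bottom by the
conjugate of the numerator to get

$$f'(x) = \lim_{h \to 0} \frac{\sqrt{x + h} - \sqrt{x}}{h} \times \frac{\sqrt{x + h} + \sqrt{x}}{\sqrt{x + h} + \sqrt{x}} = \lim_{h \to 0} \frac{(x + h) - x}{h(\sqrt{x + h} + \sqrt{x})};$$

now we can cancel the x terms on the top, cancel a factor of h from top and
bottom, and take the limit to see that

$$f'(x) = \lim_{h \to 0} \frac{h}{h(\sqrt{x + h} + \sqrt{x})} = \lim_{h \to 0} \frac{1}{\sqrt{x + h} + \sqrt{x}} = \frac{1}{\sqrt{x} + \sqrt{x}} = \frac{1}{2\sqrt{x}}.$$

In summary, we have shown that

$$\frac{d}{dx}(\sqrt{x}) = \frac{1}{2\sqrt{x}}.$$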

Now how would you find the derivative of $f(x) = \sqrt{x} + x^2$ using the
definition of the derivative? Even if you can just write down the answer,
I’ve asked you to use the definition, so you must put all temptations aside
and use the formula:

$$f'(x) = \lim_{h \to 0} \frac{f(x + h) - f(x)}{h} = \lim_{h \to 0} \frac{(\sqrt{x + h} + (x + h)^2) - (\sqrt{x} + x^2)}{h}.$$

This looks pretty messy, but if we split it up into the terms involving the
square-root stuff and the terms involving the square stuff, we see that

$$f'(x) = \lim_{h \to 0} \frac{\sqrt{x + h} - \sqrt{x}}{h} + \lim_{h \to 0} \frac{(x + h)^2 - x^2}{h}.$$

We know how to do both of these limits; we have just seen that the first one
is $1/(2\sqrt{x})$, and we did the second one in Section 5.2.6 of the previous chapter
and got the answer 2x. You should try doing both of them without looking
back at the previous work and make sure you get the answer

$$f'(x) = \frac{1}{2\sqrt{x}} + 2x.$$
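
As a sanity check on the three derivatives found so far in this section, the sketch below compares each formula with a difference quotient at a small h. The choice of x = 4 and h = 10⁻⁶, and all the names in the code, are my own assumptions; a numerical match is evidence, not a proof.

```python
import math


def difference_quotient(f, x, h):
    """(f(x + h) - f(x)) / h for a small nonzero h."""
    return (f(x + h) - f(x)) / h


# Each entry: (label, function, derivative formula obtained in the text).
cases = [
    ("1/x", lambda x: 1 / x, lambda x: -1 / x**2),
    ("sqrt(x)", math.sqrt, lambda x: 1 / (2 * math.sqrt(x))),
    ("sqrt(x) + x^2", lambda x: math.sqrt(x) + x**2,
     lambda x: 1 / (2 * math.sqrt(x)) + 2 * x),
]

x, h = 4.0, 1e-6
for label, f, f_prime in cases:
    # The two numbers on each line should agree to several decimal places.
    print(label, difference_quotient(f, x, h), f_prime(x))
```

If a formula had a sign error or a misplaced factor of 2, the two columns would visibly disagree, which makes this a cheap way to catch algebra slips.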
It’s now time to find the derivative of $x^n$ with respect to x, where n is
some positive integer. Set $f(x) = x^n$; then we have

$$f'(x) = \lim_{h \to 0} \frac{f(x + h) - f(x)}{h} = \lim_{h \to 0} \frac{(x + h)^n - x^n}{h}.$$

Somehow we have to deal with $(x + h)^n$. There are several ways of doing this;
let’s try the most direct approach, which is to write

$$(x + h)^n = (x + h)(x + h) \cdots (x + h).$$

There are n factors in the above product. This would be a real mess to
multiply out, but it turns out we don’t need to do the whole thing; we just
need to get started. If you take the term x from each factor, there are n of
them, so you get one term $x^n$ in the product. That’s the only way to get all
x factors, so we already have

$$(x + h)^n = (x + h)(x + h) \cdots (x + h) = x^n + \text{stuff involving } h.$$

We need to do a little more work, though. What if you take the term h from
the first factor and x from the others? Then you have one h and (n − 1) copies
of x, so you get $hx^{n-1}$ when you multiply them all together. There are other
ways to choose one h and the rest x: you could take the h from the second
factor and all others x, or the h from the third factor, and so on. In fact,
there are n ways you could pick one h and the rest x, so you actually have
n copies of $hx^{n-1}$. Together, this makes $nhx^{n-1}$. Every other term in the
expansion has at least two copies of h, so every other term has a factor of $h^2$.
All in all, we can write

$$(x + h)^n = (x + h)(x + h) \cdots (x + h) = x^n + nhx^{n-1} + \text{stuff with } h^2 \text{ as a factor}.$$

Let’s tidy this up one little bit: we’ll write the “stuff with $h^2$ as a factor” in
the form $h^2 \times (\text{junk})$, where “junk” is just a polynomial in x and h. That is,

$$(x + h)^n = (x + h)(x + h) \cdots (x + h) = x^n + nhx^{n-1} + h^2 \times (\text{junk}).$$

Now we can substitute into the formula for the derivative:

$$f'(x) = \lim_{h \to 0} \frac{(x + h)^n - x^n}{h} = \lim_{h \to 0} \frac{x^n + nhx^{n-1} + h^2 \times (\text{junk}) - x^n}{h}.$$

The $x^n$ terms cancel, and then we can cancel out a factor of h:

$$f'(x) = \lim_{h \to 0} \frac{nhx^{n-1} + h^2 \times (\text{junk})}{h} = \lim_{h \to 0} (nx^{n-1} + h \times (\text{junk})).$$

As h → 0, the second term goes to 0 (since the junk is pretty benign and
doesn’t blow up!), but the first term remains as $nx^{n-1}$. So we conclude that

$$\frac{d}{dx}(x^n) = nx^{n-1}$$

when n is a positive integer. In fact, we’ll show in Section 9.5.1 of Chapter 9
that

$$\frac{d}{dx}(x^a) = ax^{a-1}$$

when a is any real number at all. In words, you are simply taking the power,
putting a copy of it out front as the coefficient, and then knocking the power
down by 1.
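
The counting argument for the first two terms of $(x + h)^n$ can be mirrored in code by multiplying out the n factors one at a time and tracking only how many h's each term contains. This is a sketch under my own representation (a list indexed by the power of h); it is not how the book presents the expansion.

```python
def expand_power(n):
    """Coefficients of (x + h)^n grouped by power of h.

    Entry k of the result is the number c_k such that the expansion contains
    c_k * h^k * x^(n - k). Built by multiplying n copies of (x + h) one at a time.
    """
    coeffs = [1]  # (x + h)^0 = 1
    for _ in range(n):
        # Multiplying by (x + h): each existing term either picks x (its power
        # of h stays the same) or picks h (its power of h goes up by one).
        picked_x = coeffs + [0]
        picked_h = [0] + coeffs
        coeffs = [a + b for a, b in zip(picked_x, picked_h)]
    return coeffs


for n in range(1, 7):
    print(n, expand_power(n))
```

In every row the first entry is 1 (the single $x^n$ term) and the second entry is n (the n copies of $hx^{n-1}$); everything after that carries at least $h^2$ and is the "junk" that disappears in the limit.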

Let’s take a closer look at the above formula. First, when a = 0, then $x^a$
is the constant function 1. The derivative is then $0x^{-1}$, which is just 0. This
agrees with the computation we did in Section 5.2.8 of the previous chapter;
in summary,

if C is constant, then $\frac{d}{dx}(C) = 0$.

Now, if a = 1, then $x^a$ is just x. According to the formula, the derivative
is $1x^0$, which is the constant function 1. Again, this agrees with our results
from Section 5.2.8 of the previous chapter; we have confirmed that

$$\frac{d}{dx}(x) = 1.$$

When a = 2, then we see that the derivative of $x^2$ with respect to x is $2x^1$,
which is just 2x. This agrees with what we found previously. Similarly, when
a = −1, we can use our formula to see that the derivative of $x^{-1}$ is $-1 \times x^{-2}$.
In fact, this just says that the derivative of 1/x is $-1/x^2$, which we already
knew from the beginning of this section! This example comes up so often that
you should just learn it individually.

Now let’s try some fractional powers. When a = 1/2, you see that the
derivative with respect to x of $x^{1/2}$ is $\frac{1}{2}x^{-1/2}$. By the exponential rules (see
Section 9.1.1 in Chapter 9 for a review of these!), you can rewrite this and see
that the derivative of $\sqrt{x}$ is $1/(2\sqrt{x})$, which is exactly what we found above.
Again, this comes up so often that it’s worth learning it individually so that
you don’t have to mess around with powers of 1/2 and −1/2. Finally, let’s try
a = 1/3. Our formula says that

$$\frac{d}{dx}(x^{1/3}) = \frac{1}{3}x^{-2/3}.$$

Using exponential rules (again, you can find these in Section 9.1.1 of Chapter 9),
we can rewrite this whole thing as

$$\frac{d}{dx}(\sqrt[3]{x}) = \frac{1}{3\sqrt[3]{x^2}}.$$

This one is a little more esoteric, so I wouldn’t worry about learning it. Just
make sure you can derive it using the formula for the derivative of $x^a$ with
respect to x from the box above.
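
The recipe "copy the power out front, then knock the power down by 1" is mechanical enough to write as a two-line function. The sketch below uses exact fractions so that powers like 1/2 and 1/3 come out cleanly; the function name and the (coefficient, power) return convention are my own.

```python
from fractions import Fraction


def power_rule(a):
    """Differentiate x^a: return (coefficient, new_power), meaning coefficient * x^new_power."""
    a = Fraction(a)
    return a, a - 1


# The powers worked through in the text: 0, 1, 2, -1, 1/2 and 1/3.
for a in (0, 1, 2, -1, Fraction(1, 2), Fraction(1, 3)):
    coefficient, new_power = power_rule(a)
    print(f"d/dx x^({a}) = ({coefficient}) * x^({new_power})")
```

For a = 0 the coefficient is 0, so the whole derivative vanishes regardless of the power attached, which is the constant-function case from the text.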


6.2 Finding Derivatives (the Nice Way)

All this messing about with limits in order to find derivatives is getting a bit
tedious. Luckily, once you do it, you can build up other derivatives from the
ones you’ve already found by means of simple rules. Let’s define a function f
as follows:

$$f(x) = \frac{3x^7 + x^4\sqrt{2x^5 + 15x^{4/3} - 23x + 9}}{6x^2 - 4}.$$

The key to differentiating a function like this is to understand how it is
synthesized from simpler functions. In Section 6.2.6 below, we’ll see how to
use simple operations (multiplication by a constant, adding and subtracting,
multiplying, dividing, and composing functions) to build f from atoms of the
form $x^a$, which we already know how to differentiate. First we need to see how
taking derivatives is affected by each of these operations; then we’ll come back
and find f'(x) for our nasty function f above. (See Section A.6 of Appendix A
for proofs of the rules below, although there are intuitive justifications of some
of them in Section 6.2.7.)

6,2.1: Gonstartf/priultiples of functions* 

It’s easy to deal with a constant multiple of a function: you just multiply by
the constant after you differentiate. For example, we know the derivative of
$x^2$ is $2x$, so the derivative of $7x^2$ is 7 times $2x$, or $14x$. The derivative of $-x^2$
is $-2x$, since you can think of the minus out front as multiplication by $-1$.
There’s actually an easy way to take the derivative of a constant multiple of
$x^a$. Simply bring the power down, multiply it by the coefficient, and then
knock the power down by one. So for the derivative of $7x^2$, bring the 2 down,
multiply it by 7 to get the coefficient 14, then knock the power of $x$ down by
one to get $14x^1$ or just $14x$. Similarly, to find the derivative of $13x^4$, multiply
13 by 4, giving a coefficient of 52, and then knock the power down by one to
get $52x^3$.
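
The bring-the-power-down recipe is mechanical enough to write as a tiny Python sketch. The function name `power_rule` and the idea of storing a term as a (coefficient, power) pair are just illustrative choices, not anything from the text.

```python
def power_rule(coefficient, power):
    """Differentiate coefficient * x**power.

    Returns the (coefficient, power) pair describing the derivative.
    """
    # Bring the power down, multiply it by the coefficient,
    # then knock the power down by one.
    return coefficient * power, power - 1


print(power_rule(7, 2))   # (14, 1), that is, 14x
print(power_rule(13, 4))  # (52, 3), that is, 52x^3
print(power_rule(-1, 2))  # (-2, 1), that is, -2x
```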

6.2.2 Sums and differences of functions

It’s even easier to differentiate sums and differences of functions: just
differentiate each piece and then add or subtract. For example, what’s the
derivative with respect to $x$ of

$$3x^5 - 2x^2 + \frac{7}{\sqrt{x}} + 2?$$

First write $1/\sqrt{x}$ as $x^{-1/2}$, so this means that we really have to differentiate
$3x^5 - 2x^2 + 7x^{-1/2} + 2$. Using the method for constant multiples that we
have just seen, the derivative of $3x^5$ is $15x^4$; similarly, the derivative of $-2x^2$
is $-4x$, and the derivative of $7x^{-1/2}$ is $-\frac{7}{2}x^{-3/2}$. Finally, the derivative of 2
is 0, since 2 is a constant. That is, the $+2$ at the end is irrelevant, as far as
taking derivatives is concerned. So, we can just put the pieces together to see
that

$$\frac{d}{dx}\left(3x^5 - 2x^2 + \frac{7}{\sqrt{x}} + 2\right) = \frac{d}{dx}\left(3x^5 - 2x^2 + 7x^{-1/2} + 2\right) = 15x^4 - 4x - \frac{7}{2}x^{-3/2}.$$

By the way, it’s useful to realize that you can write $x^{3/2}$ as $x\sqrt{x}$, so you could
also write the above derivative as

$$15x^4 - 4x - \frac{7}{2x\sqrt{x}}.$$

Similarly, $x^{5/2}$ is the same as $x^2\sqrt{x}$, and $x^{7/2}$ is the same as $x^3\sqrt{x}$, and so on.
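
Differentiating piece by piece can be sketched in a few lines of Python. This assumes each term is stored as a (coefficient, power) pair; the name `differentiate_terms` is made up for illustration.

```python
def differentiate_terms(terms):
    """Differentiate a sum of terms, each given as (coefficient, power)."""
    derivative = []
    for coefficient, power in terms:
        new_coefficient = coefficient * power
        if new_coefficient != 0:  # constants differentiate to 0, so drop them
            derivative.append((new_coefficient, power - 1))
    return derivative


# 3x^5 - 2x^2 + 7x^(-1/2) + 2, with the constant 2 written as 2x^0
terms = [(3, 5), (-2, 2), (7, -0.5), (2, 0)]
print(differentiate_terms(terms))  # [(15, 4), (-4, 1), (-3.5, -1.5)]
```

The output reads as $15x^4 - 4x - 3.5x^{-3/2}$, with the constant term gone.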

6.2.3 Products of functions via the product rule

It’s a little trickier dealing with products: you can’t just multiply the two
derivatives together. For example, let’s say we want to find the derivative of

$$h(x) = (x^5 + 2x - 1)(3x^8 - 2x^7 - x^4 - 3x)$$

without expanding everything first (that would take way too long). Let’s set
$f(x) = x^5 + 2x - 1$ and $g(x) = 3x^8 - 2x^7 - x^4 - 3x$. The function $h$ is the
product of $f$ and $g$. We can easily write down the derivatives of $f$ and $g$: they
are $f'(x) = 5x^4 + 2$ and $g'(x) = 24x^7 - 14x^6 - 4x^3 - 3$. As I said, it’s not true
that the derivative of the product $h$ is the product of these two derivatives.
That is, $h'(x) \neq (5x^4 + 2)(24x^7 - 14x^6 - 4x^3 - 3)$. It’s no good saying what
$h'(x)$ isn’t; we need to say what it is!

It turns out that you have to mix and match. That is, you take the
derivative of $f$ and multiply it by $g$ (not the derivative of $g$). Then you also
have to take the derivative of $g$ and multiply it by $f$. Finally, add the two
things together. Here’s the rule:

Product rule (version 1): if $h(x) = f(x)g(x)$, then
$$h'(x) = f'(x)g(x) + f(x)g'(x).$$

So, for our example of $h(x) = (x^5 + 2x - 1)(3x^8 - 2x^7 - x^4 - 3x)$, we write
$h$ as the product of $f$ and $g$ and then take their derivatives, as we did above.
Let’s summarize what we found, taking a column each for $f$ and $g$:

$$f(x) = x^5 + 2x - 1 \qquad g(x) = 3x^8 - 2x^7 - x^4 - 3x$$
$$f'(x) = 5x^4 + 2 \qquad g'(x) = 24x^7 - 14x^6 - 4x^3 - 3.$$

Now we can use the product rule and do a sort of cross-multiplication. You
see, we need to multiply $f'(x)$ on the bottom left by $g(x)$ on the top right,
then add to this the product of $f(x)$ from the top left and $g'(x)$ from the
bottom right. So we get

$$\begin{aligned}
h'(x) &= f'(x)g(x) + f(x)g'(x) \\
&= (5x^4 + 2)(3x^8 - 2x^7 - x^4 - 3x) \\
&\quad + (x^5 + 2x - 1)(24x^7 - 14x^6 - 4x^3 - 3).
\end{aligned}$$





You could multiply this out, but it would be even worse than multiplying out 
the original function h and then differentiating that. Just leave it as it is. 
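
If you’d rather not take the mix-and-match formula on faith, here’s a rough numerical check in Python. It compares the product-rule answer for $h$ with a central-difference estimate of the slope. The helper `central_difference` and its step size are assumptions made for this sketch, not part of the text.

```python
def f(x):
    return x**5 + 2*x - 1

def g(x):
    return 3*x**8 - 2*x**7 - x**4 - 3*x

def f_prime(x):
    return 5*x**4 + 2

def g_prime(x):
    return 24*x**7 - 14*x**6 - 4*x**3 - 3

def h(x):
    return f(x) * g(x)

def h_prime(x):
    # Product rule: f'(x) g(x) + f(x) g'(x)
    return f_prime(x) * g(x) + f(x) * g_prime(x)

def central_difference(func, x, step=1e-6):
    """Estimate the slope of func at x from two nearby points."""
    return (func(x + step) - func(x - step)) / (2 * step)

print(h_prime(1))                  # product rule: -15
print(f_prime(1) * g_prime(1))     # naive product of derivatives: 21 (wrong)
print(central_difference(h, 1.0))  # numerical estimate, close to -15
```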

There’s another way to write the product rule. Indeed, sometimes you
have to deal with $y$ = stuff in $x$, instead of the $f(x)$ form. For example,
suppose $y = (x^3 + 2x)(3x + \sqrt{x} + 1)$. What is $dy/dx$? In this case, it’s easier
to let $u = x^3 + 2x$ and $v = 3x + \sqrt{x} + 1$. Then we can take the above form of
the product rule and make some replacements: first, $u$ replaces $f(x)$, so that
$du/dx$ replaces $f'(x)$; we also do the same thing with $v$ and $g(x)$. We get

Product rule (version 2): if $y = uv$, then
$$\frac{dy}{dx} = v\frac{du}{dx} + u\frac{dv}{dx}.$$

So, in our example, we have

$$u = x^3 + 2x \qquad v = 3x + \sqrt{x} + 1$$
$$\frac{du}{dx} = 3x^2 + 2 \qquad \frac{dv}{dx} = 3 + \frac{1}{2\sqrt{x}}.$$

This means that

$$\frac{dy}{dx} = v\frac{du}{dx} + u\frac{dv}{dx} = (3x + \sqrt{x} + 1)(3x^2 + 2) + (x^3 + 2x)\left(3 + \frac{1}{2\sqrt{x}}\right).$$

What if you have a product of three terms? For example, suppose

$$y = (x^2 + 1)(x^2 + 3x)(x^5 + 2x^4 + 7)$$

and you want to find $dy/dx$. You could multiply it all out and differentiate,
or instead you could use the product rule for three terms:

Product rule (three variables): if $y = uvw$, then
$$\frac{dy}{dx} = \frac{du}{dx}vw + u\frac{dv}{dx}w + uv\frac{dw}{dx}.$$

Before we finish the example, here’s a tip for remembering the above formula:
just add up $uvw$ three times, but put a $d/dx$ in front of a different variable in
each term. (The same trick works for four or more variables: every variable
gets differentiated once!) Anyway, in our example, we’ll let $u = x^2 + 1$,
$v = x^2 + 3x$, and $w = x^5 + 2x^4 + 7$, so that $y$ is the product $uvw$. We have
$du/dx = 2x$, $dv/dx = 2x + 3$, and $dw/dx = 5x^4 + 8x^3$. According to the above
formula, we have

$$\begin{aligned}
\frac{dy}{dx} &= \frac{du}{dx}vw + u\frac{dv}{dx}w + uv\frac{dw}{dx} \\
&= (2x)(x^2 + 3x)(x^5 + 2x^4 + 7) + (x^2 + 1)(2x + 3)(x^5 + 2x^4 + 7) \\
&\quad + (x^2 + 1)(x^2 + 3x)(5x^4 + 8x^3).
\end{aligned}$$


Since we didn’t multiply out and simplify the original expression for y above, 
I’m certainly not going to simplify the derivative! I do want to mention, 
however, that you can’t always multiply everything out. Sometimes you just 
have to use the product rule. For example, after you learn how to differentiate 
trig functions in the next chapter, you’ll want to be able to use the product 
rule to find derivatives of things like $x\sin(x)$. You can’t really multiply this
expression out; it’s already as expanded as it can get. So if you want to
differentiate it with respect to $x$, there’s no easy way of avoiding using the
product rule.
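
The “differentiate each variable once” tip works for any number of factors, and a short Python sketch makes that concrete. The function `product_rule` below is a hypothetical helper: it takes the values of the factors and the values of their derivatives at some point, and adds up one term per factor.

```python
import math

def product_rule(values, derivatives):
    """Derivative of a product, given each factor's value and derivative."""
    total = 0
    for i, derivative in enumerate(derivatives):
        # Differentiate factor i and leave all the other factors alone.
        others = values[:i] + values[i + 1:]
        total += derivative * math.prod(others)
    return total

def dy_dx(x):
    # y = (x^2 + 1)(x^2 + 3x)(x^5 + 2x^4 + 7), the three-factor example
    values = [x**2 + 1, x**2 + 3*x, x**5 + 2*x**4 + 7]
    derivatives = [2*x, 2*x + 3, 5*x**4 + 8*x**3]
    return product_rule(values, derivatives)

print(dy_dx(1))  # 80 + 100 + 104 = 284
```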


6.2.4 Quotients of functions via the quotient rule

Quotients are handled in a way similar to products, except that the rule is a
little different. Let’s say you want to differentiate

$$h(x) = \frac{2x^3 - 3x + 1}{x^5 - 8x^3 + 2}$$

with respect to $x$. You can let $f(x) = 2x^3 - 3x + 1$ and $g(x) = x^5 - 8x^3 + 2$;
then you can write $h$ as the quotient of $f$ and $g$, or $h(x) = f(x)/g(x)$. Now
here’s the quotient rule:

Quotient rule (version 1): if $h(x) = \dfrac{f(x)}{g(x)}$, then
$$h'(x) = \frac{f'(x)g(x) - f(x)g'(x)}{(g(x))^2}.$$

Notice that the numerator of the right-hand fraction is the same as the
numerator in the product rule, except with a minus instead of a plus. In our
example, we need to differentiate $f$ and $g$ and summarize our results:

$$f(x) = 2x^3 - 3x + 1 \qquad g(x) = x^5 - 8x^3 + 2$$
$$f'(x) = 6x^2 - 3 \qquad g'(x) = 5x^4 - 24x^2.$$

By the quotient rule, since $h(x) = f(x)/g(x)$, we have

$$h'(x) = \frac{f'(x)g(x) - f(x)g'(x)}{(g(x))^2} = \frac{(6x^2 - 3)(x^5 - 8x^3 + 2) - (2x^3 - 3x + 1)(5x^4 - 24x^2)}{(x^5 - 8x^3 + 2)^2}.$$



There’s also another version, just as there is in the case of the product rule.
If instead you are given that

$$y = \frac{3x^2 + 1}{2x^8 - 7}$$

and you want to find $dy/dx$, then start by writing $u = 3x^2 + 1$ and $v = 2x^8 - 7$,
so that $y = u/v$. Now we use:

Quotient rule (version 2): if $y = \dfrac{u}{v}$, then
$$\frac{dy}{dx} = \frac{v\dfrac{du}{dx} - u\dfrac{dv}{dx}}{v^2}.$$

Our summary box looks like this:

$$u = 3x^2 + 1 \qquad v = 2x^8 - 7$$
$$\frac{du}{dx} = 6x \qquad \frac{dv}{dx} = 16x^7.$$

By the quotient rule,

$$\frac{dy}{dx} = \frac{v\dfrac{du}{dx} - u\dfrac{dv}{dx}}{v^2} = \frac{(2x^8 - 7)(6x) - (3x^2 + 1)(16x^7)}{(2x^8 - 7)^2}.$$


As you can see, quotients aren’t any harder than products (just a bit messier). 
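
Here’s a small Python sketch of the quotient rule (version 2) applied to the second example. It uses exact fractions so there’s no rounding to worry about; the function names are my own choices for illustration.

```python
from fractions import Fraction

def quotient_rule(u, du, v, dv):
    """Derivative of u/v, given u, v and their derivatives at one point."""
    return (v * du - u * dv) / (v * v)

def dy_dx(x):
    # y = (3x^2 + 1) / (2x^8 - 7)
    x = Fraction(x)
    u, du = 3*x**2 + 1, 6*x
    v, dv = 2*x**8 - 7, 16*x**7
    return quotient_rule(u, du, v, dv)

print(dy_dx(1))  # -94/25
```

At $x = 1$ we have $u = 4$, $du/dx = 6$, $v = -5$ and $dv/dx = 16$, so the rule gives $((-5)(6) - (4)(16))/25 = -94/25$.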







6.2.5 Composition of functions via the chain rule

Suppose $h(x) = (x^2 + 1)^{99}$ and you want to find $h'(x)$. It would be ridiculous
to multiply it out: you’d have to multiply $x^2 + 1$ by itself 99 times and it
would take days. It would also be crazy to use the product rule, since you’d
need to use it too many times.

Instead, let’s view $h$ as the composition of two functions $f$ and $g$, where
$g(x) = x^2 + 1$ and $f(x) = x^{99}$. Indeed, if you take your $x$ and hit it with $g$,
you end up with $x^2 + 1$. If you now hit that with $f$, you get $(x^2 + 1)^{99}$, which
is just $h(x)$. So we have written $h(x)$ as $f(g(x))$. (Check out Section 1.3 in
Chapter 1 for more on how composition of functions works.) Now we can
apply the chain rule:

Chain rule (version 1): if $h(x) = f(g(x))$, then $h'(x) = f'(g(x))g'(x)$.



The formula looks a little tricky. Let’s decompose it. The second factor is
easy: it’s just the derivative of $g$. How about the first factor? Well, you have
to differentiate $f$, then evaluate the result at $g(x)$ instead of $x$.

In our example, we have $f(x) = x^{99}$, so $f'(x) = 99x^{98}$. We also have
$g(x) = x^2 + 1$, so $g'(x) = 2x$. There’s our second factor: just $2x$. How about
the first one? Well, we take $f'(x)$, but instead of $x$, we put in $x^2 + 1$ (since
that’s what $g(x)$ is). That is, $f'(g(x)) = f'(x^2 + 1) = 99(x^2 + 1)^{98}$. Now we
multiply our two factors together to get

$$h'(x) = f'(g(x))g'(x) = 99(x^2 + 1)^{98}(2x) = 198x(x^2 + 1)^{98}.$$



This might seem a little tortuous, to say the least. Here’s another way to
solve the same problem.

We start with $y = (x^2 + 1)^{99}$ and we want to find $dy/dx$. The $(x^2 + 1)$
term makes life difficult, so we’ll just call it $u$. This means that $y = u^{99}$ where
$u = x^2 + 1$. Now we can invoke the other version of the chain rule:

Chain rule (version 2): if $y$ is a function of $u$, and $u$ is a function of $x$, then
$$\frac{dy}{dx} = \frac{dy}{du}\frac{du}{dx}.$$

So in our case, we have

$$y = u^{99} \qquad u = x^2 + 1$$
$$\frac{dy}{du} = 99u^{98} \qquad \frac{du}{dx} = 2x.$$

Using the chain rule formula in the box above, we see that

$$\frac{dy}{dx} = \frac{dy}{du}\frac{du}{dx} = 99u^{98} \times 2x = 198xu^{98}.$$

Now you just need to tidy it up by replacing $u$ by $x^2 + 1$ to see that we have
$dy/dx = 198x(x^2 + 1)^{98}$, as we found above.
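
As a sanity check on the chain rule answer, here’s a short Python sketch comparing $198x(x^2 + 1)^{98}$ with a numerical slope estimate. The helper `central_difference`, the step size, and the test point $x = 0.1$ are all assumptions chosen to keep the numbers small.

```python
def h(x):
    return (x**2 + 1)**99

def h_prime(x):
    # Chain rule: 99 (x^2 + 1)^98 times 2x
    return 198 * x * (x**2 + 1)**98

def central_difference(func, x, step=1e-6):
    """Estimate the slope of func at x from two nearby points."""
    return (func(x + step) - func(x - step)) / (2 * step)

x = 0.1
print(h_prime(x))                # chain rule answer
print(central_difference(h, x))  # numerical estimate, should agree closely
```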
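
Here’s another straightforward example. What is $dy/dx$ if $y = \sqrt{x^3 - 7x}$?
Start by setting $u = x^3 - 7x$, so that $y = \sqrt{u}$. Our table looks like

$$y = \sqrt{u} \qquad u = x^3 - 7x$$
$$\frac{dy}{du} = \frac{1}{2\sqrt{u}} \qquad \frac{du}{dx} = 3x^2 - 7.$$

Now we just have to use the chain rule to see that

$$\frac{dy}{dx} = \frac{dy}{du}\frac{du}{dx} = \frac{1}{2\sqrt{u}}(3x^2 - 7).$$

We still need to get rid of the $u$ in the denominator; since $u = x^3 - 7x$, we have

$$\frac{dy}{dx} = \frac{3x^2 - 7}{2\sqrt{x^3 - 7x}}.$$

Not so bad when you get the hang of it.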

Two quick comments on the chain rule. First, why is it called the chain
rule, anyway? Well, you start with $x$ and it gives you $u$; then you take that
$u$ and get $y$. So there’s a sort of chain from $x$ to $y$ through the extra variable
$u$. Second, you might think that the chain rule is obvious. After all, in the
formula in the box above, can’t you just cancel out the factor
of $du$? The answer is no. Remember, expressions like $dy/du$ and $du/dx$ aren’t
actually fractions, they are limits of fractions (see Section 5.2.7 in the previous
chapter for more on this). The nice thing is that they often behave as if they
were fractions, and they certainly do in this case.

The chain rule can actually be invoked multiple times all at once. For
example, let

$$y = \left((x^3 - 10x)^9 + 22\right)^8.$$

What is $dy/dx$? Simply let $u = x^3 - 10x$, and $v = u^9 + 22$, so that $y = v^8$.
Then use a longer form of the chain rule:

$$\frac{dy}{dx} = \frac{dy}{dv}\frac{dv}{du}\frac{du}{dx}.$$

You can’t get this wrong if you think about it: $y$ is a function of $v$, which is
a function of $u$, which is a function of $x$. So there’s only one way the formula
could possibly look! Anyway, we have

$$y = v^8 \qquad v = u^9 + 22 \qquad u = x^3 - 10x$$
$$\frac{dy}{dv} = 8v^7 \qquad \frac{dv}{du} = 9u^8 \qquad \frac{du}{dx} = 3x^2 - 10.$$

Plugging everything in, we get

$$\frac{dy}{dx} = (8v^7)(9u^8)(3x^2 - 10).$$

First, replace $v$ by $u^9 + 22$. Now replace $u$ by $x^3 - 10x$ and group the
factors of 8 and 9 together to get the actual answer:

$$\frac{dy}{dx} = (8(u^9 + 22)^7)(9u^8)(3x^2 - 10) = 72((x^3 - 10x)^9 + 22)^7(x^3 - 10x)^8(3x^2 - 10).$$
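
One way to watch the chain rule being applied link by link is to carry a value and its derivative around together. The Python sketch below does this with a small class I’ve called `Dual`. This is a swapped-in technique for illustration (often called forward-mode automatic differentiation), not something the text above uses. Every arithmetic step updates the derivative with the sum, product, or power-plus-chain rule, and the result is compared with the formula we just found.

```python
class Dual:
    """A value together with its derivative with respect to x."""

    def __init__(self, value, deriv=0):
        self.value = value
        self.deriv = deriv

    @staticmethod
    def _lift(other):
        # Plain numbers are constants, so their derivative is 0.
        return other if isinstance(other, Dual) else Dual(other)

    def __add__(self, other):
        other = Dual._lift(other)
        return Dual(self.value + other.value, self.deriv + other.deriv)

    def __sub__(self, other):
        other = Dual._lift(other)
        return Dual(self.value - other.value, self.deriv - other.deriv)

    def __mul__(self, other):
        other = Dual._lift(other)
        # Product rule
        return Dual(self.value * other.value,
                    self.deriv * other.value + self.value * other.deriv)

    __radd__ = __add__
    __rmul__ = __mul__

    def __pow__(self, n):
        # Power rule combined with the chain rule (n a positive integer)
        return Dual(self.value**n, n * self.value**(n - 1) * self.deriv)


def y(x):
    return ((x**3 - 10*x)**9 + 22)**8

def dy_dx_formula(x):
    return 72 * ((x**3 - 10*x)**9 + 22)**7 * (x**3 - 10*x)**8 * (3*x**2 - 10)

# Seed x = 2 with derivative 1, since dx/dx = 1.
print(y(Dual(2, 1)).deriv == dy_dx_formula(2))  # True
```

Because everything here is integer arithmetic, the two answers agree exactly, not just approximately.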

We’ve mostly used the second version of the chain rule above, but there are
times when the first version comes in useful. For example, if you know that
$h(x) = \sqrt{g(x)}$ for some functions $g$ and $h$, and all you know about $g$ is that
$g(5) = 4$ and $g'(5) = 7$, then you can still find $h'(5)$. Just set $f(x) = \sqrt{x}$ so
that $h(x) = f(g(x))$, then use the formula $h'(x) = f'(g(x))g'(x)$ from above.
Since $f(x) = \sqrt{x}$, we have $f'(x) = 1/(2\sqrt{x})$; so

$$h'(x) = f'(g(x))g'(x) = \frac{1}{2\sqrt{g(x)}}g'(x).$$

Now substitute $x = 5$ to get

$$h'(5) = \frac{1}{2\sqrt{g(5)}}g'(5).$$

Since $g(5) = 4$ and $g'(5) = 7$, we have

$$h'(5) = \frac{1}{2\sqrt{4}}(7) = \frac{7}{4}.$$


One more example: suppose that $j(x) = g(\sqrt{x})$, where $g$ is as above. What
is $j'(25)$? Now we have $j(x) = g(f(x))$; here $f(x) = \sqrt{x}$ as before. This time,
it works out that

$$j'(x) = g'(f(x))f'(x) = g'(\sqrt{x})\frac{1}{2\sqrt{x}}.$$

So if $x = 25$, we have

$$j'(25) = g'(\sqrt{25})\frac{1}{2\sqrt{25}} = g'(5)\frac{1}{10} = \frac{7}{10},$$

since $g'(5) = 7$. Compare these two examples: the order of composition makes
a big difference!
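
The two computations can be laid side by side in a few lines of Python. The only facts used about $g$ are the two given values, stored here in variables whose names are my own.

```python
import math

g_at_5 = 4        # given: g(5) = 4
g_prime_at_5 = 7  # given: g'(5) = 7

def sqrt_prime(x):
    """Derivative of the square root function, 1 / (2 sqrt(x))."""
    return 1 / (2 * math.sqrt(x))

# h(x) = sqrt(g(x)): outer function is sqrt, inner function is g.
h_prime_at_5 = sqrt_prime(g_at_5) * g_prime_at_5

# j(x) = g(sqrt(x)): outer function is g, inner function is sqrt.
# At x = 25 the inner value is sqrt(25) = 5, so g'(5) is what we need.
j_prime_at_25 = g_prime_at_5 * sqrt_prime(25)

print(h_prime_at_5)   # 7/4
print(j_prime_at_25)  # 7/10, up to floating-point rounding
```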


6.2.6 A nasty example

Let’s return to our function $f$ from above:

$$f(x) = \frac{3x^7 + x^4\sqrt{2x^5 + 15x^{4/3} - 23x + 9}}{6x^2 - 4}.$$



To find $f'(x)$, we have to synthesize $f$ from easier functions using the rules
from the previous sections. It’s not a bad idea to do this using the function
notation (version 1 of all the rules above). Try this now!

Meanwhile, I’m going to use version 2 of all the rules. We’ll set $y = f(x)$
and try to find $dy/dx$. The first thing to notice is that $y$ is the quotient of two
things: $u = 3x^7 + x^4\sqrt{2x^5 + 15x^{4/3} - 23x + 9}$ and $v = 6x^2 - 4$. We’re going
to use the quotient rule to deal with the fraction, so we’ll need $du/dx$ and
$dv/dx$. The second of these is easy: it’s just $12x$. The first is a bit harder.
Let’s summarize what we know so far:

$$u = 3x^7 + x^4\sqrt{2x^5 + 15x^{4/3} - 23x + 9} \qquad v = 6x^2 - 4$$
$$\frac{du}{dx} = \text{???} \qquad \frac{dv}{dx} = 12x.$$



If we just knew $du/dx$, we could use the quotient rule and we’d be done. So
let’s find $du/dx$.

First, note that $u$ is the sum of $q = 3x^7$ and the nasty quantity $r$ defined
by $r = x^4\sqrt{2x^5 + 15x^{4/3} - 23x + 9}$. We need the derivatives of both pieces.
The derivative of $q$ is easy: it’s just $21x^6$. Now, $r$ is the product of $w = x^4$
and $z = \sqrt{2x^5 + 15x^{4/3} - 23x + 9}$, so we’ll have to use the product rule to
find $dr/dx$. We’ll need to note the following:

$$w = x^4 \qquad z = \sqrt{2x^5 + 15x^{4/3} - 23x + 9}$$
$$\frac{dw}{dx} = 4x^3 \qquad \frac{dz}{dx} = \text{???}$$


Darn, we don’t know what $dz/dx$ is. We’re going to need to find that. Here
we are taking the square root of a big expression, so let’s call it $t$. Specifically,
if $t = 2x^5 + 15x^{4/3} - 23x + 9$, then $z = \sqrt{t}$. Now we can actually differentiate
everything! Let’s set up one last table:

$$t = 2x^5 + 15x^{4/3} - 23x + 9 \qquad z = \sqrt{t}$$
$$\frac{dt}{dx} = 10x^4 + 20x^{1/3} - 23 \qquad \frac{dz}{dt} = \frac{1}{2\sqrt{t}}.$$


By the chain rule (changing the variables to the letters we need),

$$\frac{dz}{dx} = \frac{dz}{dt}\frac{dt}{dx} = \frac{1}{2\sqrt{t}}\left(10x^4 + 20x^{1/3} - 23\right).$$

Replacing $t$ by its definition, $2x^5 + 15x^{4/3} - 23x + 9$, we see that

$$\frac{dz}{dx} = \frac{10x^4 + 20x^{1/3} - 23}{2\sqrt{2x^5 + 15x^{4/3} - 23x + 9}}.$$

Great! We finally know $dz/dx$. Now we can fill in the question marks from
above:

$$w = x^4 \qquad z = \sqrt{2x^5 + 15x^{4/3} - 23x + 9}$$
$$\frac{dw}{dx} = 4x^3 \qquad \frac{dz}{dx} = \frac{10x^4 + 20x^{1/3} - 23}{2\sqrt{2x^5 + 15x^{4/3} - 23x + 9}}.$$


Now look back above: we were trying to find $dr/dx$, where $r = wz$. Let’s use
the product rule:

$$\frac{dr}{dx} = z\frac{dw}{dx} + w\frac{dz}{dx}.$$

Again, notice that you have to be flexible with the variables; they’re not
always $u$ and $v$! Anyway, if you substitute from the table above, you find that

$$\frac{dr}{dx} = \left(\sqrt{2x^5 + 15x^{4/3} - 23x + 9}\right)(4x^3) + (x^4)\,\frac{10x^4 + 20x^{1/3} - 23}{2\sqrt{2x^5 + 15x^{4/3} - 23x + 9}}.$$

Taking a common denominator and simplifying reduces this (check it!) to

$$\frac{dr}{dx} = \frac{26x^8 + 140x^{13/3} - 207x^4 + 72x^3}{2\sqrt{2x^5 + 15x^{4/3} - 23x + 9}}.$$

Now go back to $u$. We saw that $u = q + r$, where we have $q = 3x^7$ and
$r = x^4\sqrt{2x^5 + 15x^{4/3} - 23x + 9}$. We know that $dq/dx = 21x^6$, and we just
worked out the messy formula for $dr/dx$, so we just add them together to get

$$\frac{du}{dx} = 21x^6 + \frac{26x^8 + 140x^{13/3} - 207x^4 + 72x^3}{2\sqrt{2x^5 + 15x^{4/3} - 23x + 9}}.$$

Finally, we can come back to our original quotient rule computation from
above, and fill in $du/dx$:

$$u = 3x^7 + x^4\sqrt{2x^5 + 15x^{4/3} - 23x + 9} \qquad v = 6x^2 - 4$$
$$\frac{du}{dx} = 21x^6 + \frac{26x^8 + 140x^{13/3} - 207x^4 + 72x^3}{2\sqrt{2x^5 + 15x^{4/3} - 23x + 9}} \qquad \frac{dv}{dx} = 12x.$$

Since $y = u/v$, we just use the standard quotient rule

$$\frac{dy}{dx} = \frac{v\dfrac{du}{dx} - u\dfrac{dv}{dx}}{v^2}$$

to see (after splitting up and canceling) that

$$\frac{dy}{dx} = \frac{21x^6 + \dfrac{26x^8 + 140x^{13/3} - 207x^4 + 72x^3}{2\sqrt{2x^5 + 15x^{4/3} - 23x + 9}}}{6x^2 - 4} - \frac{\left(3x^7 + x^4\sqrt{2x^5 + 15x^{4/3} - 23x + 9}\right)(12x)}{(6x^2 - 4)^2}.$$


We’re finally done! It’s certainly not pretty, but it’s certainly effective. 
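
With an answer this messy, it’s reassuring to test it at a point or two. The Python sketch below codes up $f$ and the final formula for $dy/dx$ and compares the formula with a numerical slope estimate. The helper `central_difference`, the step size, and the choice of test points $x = 1$ and $x = 2$ (where the square root and the denominator both behave) are assumptions made for this check.

```python
import math

def radicand(x):
    return 2*x**5 + 15*x**(4/3) - 23*x + 9

def f(x):
    return (3*x**7 + x**4 * math.sqrt(radicand(x))) / (6*x**2 - 4)

def f_prime(x):
    root = math.sqrt(radicand(x))
    u = 3*x**7 + x**4 * root
    v = 6*x**2 - 4
    du_dx = 21*x**6 + (26*x**8 + 140*x**(13/3) - 207*x**4 + 72*x**3) / (2*root)
    # Quotient rule after splitting up: (du/dx)/v - u (dv/dx) / v^2
    return du_dx / v - u * (12*x) / v**2

def central_difference(func, x, step=1e-6):
    """Estimate the slope of func at x from two nearby points."""
    return (func(x + step) - func(x - step)) / (2 * step)

for x in (1.0, 2.0):
    print(x, f_prime(x), central_difference(f, x))
```

The two columns should agree to several decimal places at each test point.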

6.2.7 Justification of the product rule and the chain rule 

You can find formal proofs of the product rule and chain rule in Sections A.6.3 
and A.6.5 of Appendix A, but it’s not a bad idea to get an intuitive idea for 
why these rules work. So let’s take a quick look. 

In the case of the product rule, we’ll use version 2 of the rule from
Section 6.2.3 above. We start off with two quantities, $u$ and $v$, which both depend
on some variable $x$. We want to see how the product $uv$ changes when we
change $x$ by a tiny amount $\Delta x$. Well, $u$ will change to $u + \Delta u$, and $v$ will
change to $v + \Delta v$, so the product changes to $(u + \Delta u)(v + \Delta v)$. We can
visualize this by thinking of a rectangle with side lengths $u$ and $v$ units. The
rectangle changes shape a little bit so that its new dimensions are $u + \Delta u$ and
$v + \Delta v$ units, like this:

[Figure: a rectangle with sides $u$ and $v$ and area $uv$, next to a slightly larger rectangle with sides $u + \Delta u$ and $v + \Delta v$ and area $(u + \Delta u)(v + \Delta v)$.]

The products $uv$ and $(u + \Delta u)(v + \Delta v)$ are just the areas of the two rectangles
in square units, respectively. So how much does the area change? Let’s see
by superimposing the two rectangles:

[Figure: the two rectangles superimposed. The original rectangle has area $uv$; the shaded L-shaped region around it is split into pieces of area $v\Delta u$, $u\Delta v$ and $\Delta u\Delta v$.]

The change in areas is precisely the area of the shaded L-shaped region. This
region is made up of two thin rectangles (of areas $v\Delta u$ and $u\Delta v$ square units)
and one little one (of area $(\Delta u)(\Delta v)$ square units). Since the change of areas
is $\Delta(uv)$ square units, we have shown that

$$\Delta(uv) = v\Delta u + u\Delta v + (\Delta u)(\Delta v).$$

When the quantities $\Delta u$ and $\Delta v$ are very small, the little area is very very
small indeed, so we can basically ignore it. Here’s what we’re saying:

$$\Delta(uv) \cong v\Delta u + u\Delta v.$$

If you divide by $\Delta x$ and take limits, the approximation becomes perfect and
we get the product rule

$$\frac{d}{dx}(uv) = v\frac{du}{dx} + u\frac{dv}{dx}.$$


This is actually pretty close to the real proof! 
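
To see just how small the little corner rectangle gets, here’s a sketch in Python using exact fractions. The function `area_change` and the sample side lengths $u = 3$, $v = 5$ are illustrative assumptions.

```python
from fractions import Fraction

def area_change(u, v, du, dv):
    """Split the change in area of a u-by-v rectangle into three pieces."""
    strip_one = v * du  # thin rectangle of area v * (change in u)
    strip_two = u * dv  # thin rectangle of area u * (change in v)
    corner = du * dv    # the little rectangle in the corner
    return strip_one, strip_two, corner

u, v = Fraction(3), Fraction(5)
for denominator in (10, 100, 1000):
    du = Fraction(1, denominator)
    dv = Fraction(2, denominator)
    pieces = area_change(u, v, du, dv)
    exact_change = (u + du) * (v + dv) - u * v
    # The three pieces always add up to the exact change in area,
    # and the corner's share of that change shrinks as du, dv shrink.
    print(sum(pieces) == exact_change, float(pieces[2] / exact_change))
```

Each time the changes get ten times smaller, the corner’s share of the total change drops by roughly a factor of ten, which is why it disappears in the limit.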

Before we move onto the chain rule, let’s prove the product rule for three
functions, which is (as we saw above) given by

$$\frac{d}{dx}(uvw) = \frac{du}{dx}vw + u\frac{dv}{dx}w + uv\frac{dw}{dx}.$$

The trick is to let $z = vw$, so that $uvw$ is just $uz$. We can use the product
rule on $z = vw$ first:

$$\frac{dz}{dx} = w\frac{dv}{dx} + v\frac{dw}{dx}.$$

Now use the product rule on $uz$ to get

$$\frac{d}{dx}(uvw) = \frac{d}{dx}(uz) = z\frac{du}{dx} + u\frac{dz}{dx}.$$

All that’s left is to replace $z$ by $vw$ and $dz/dx$ by the above expression to see
that

$$\frac{d}{dx}(uvw) = z\frac{du}{dx} + u\frac{dz}{dx} = vw\frac{du}{dx} + u\left(w\frac{dv}{dx} + v\frac{dw}{dx}\right).$$

If you expand this, you get the desired formula.

Finally, let’s think about the chain rule for a little bit. Suppose $y = f(u)$
and $u = g(x)$. This means that $u$ is a function of $x$, and $y$ is a function of $u$.
If we change $x$ by a little bit, as a result $u$ will also change by a little bit. As
a result of that, $y$ will change too. By how much will $y$ change?

Well, let’s start off by concentrating on $u$ and seeing how it reacts to a small
change in $x$. Remember that $u = g(x)$; so as we discussed in Section 5.2.7
in the previous chapter, the change in $u$ will be approximately $g'(x)$ times
the change in $x$. You can think of $g'(x)$ as a sort of stretching factor. (For
example, if you stand in front of one of those amusement park mirrors that
make you look twice as tall and skinny as you are, then stand on your toes,
your reflection will rise by twice as much as you do.) Here’s an equation that
describes this:

$$\Delta u \cong g'(x)\,\Delta x.$$

Now we can repeat the exercise with $y$ in terms of $u$. Since $y = f(u)$, a change
in $u$ will produce approximately $f'(u)$ times as much change in $y$:

$$\Delta y \cong f'(u)\,\Delta u.$$

Putting these two equations together, we get

$$\Delta y \cong f'(u)g'(x)\,\Delta x.$$

So the change in $x$ is first stretched by a factor of $g'(x)$, then again by a factor
of $f'(u)$. The overall effect is to stretch by the product of the two stretching
factors $f'(u)$ and $g'(x)$. (After all, if you stretch a piece of chewing gum by
a factor of 2, then stretch that by a factor of 3, this would be the same
as stretching the original piece of gum by a factor of 6.) This last equation
suggests that

$$\frac{dy}{dx} = f'(u)g'(x).$$

From here, you can get to either of the two versions of the chain rule without
too much difficulty. To get version 1, remember that $u = g(x)$ and $y = f(u)$,
so that $y = f(g(x))$; then let $y = h(x)$ and rewrite the above equation as

$$h'(x) = f'(u)g'(x) = f'(g(x))g'(x).$$

To get version 2, just interpret $f'(u)$ as $dy/du$ and also $g'(x)$ as $du/dx$, so
that the above equation for $dy/dx$ says that

$$\frac{dy}{dx} = \frac{dy}{du}\frac{du}{dx}.$$

The above explanation isn’t a formal proof, but it’s pretty close. 

6.3 Finding the Equation of a Tangent Line

What’s the use of finding derivatives, anyway? Well, one benefit is that you
can use derivatives to find the equation of a tangent line to a given curve.
Suppose you have a curve $y = f(x)$ and a particular point $(x, f(x))$ on the
curve. Then the tangent line through that point has slope $f'(x)$ and passes
through the point $(x, f(x))$. Now you can just use the point-slope form to
find the equation of the tangent line. In gory detail:

1. find the slope, by finding the derivative and plugging in the given value
of $x$;

2. find a point on the line, by substituting the value of $x$ into the
function itself to get the $y$-coordinate. Put the coordinates together and
call the resulting point $(x_0, y_0)$. Finally,

3. use the point-slope form $y - y_0 = m(x - x_0)$, with $m$ the slope from step 1, to find the equation.

Here’s an example. Let $y = (x^3 - 7)^{50}$. What is the equation of the tangent
line to the graph of this function at $x = 2$? First we need the derivative. We’ll
have to use the chain rule, as follows: let $u = x^3 - 7$, so $y = u^{50}$. Then we
have $dy/du = 50u^{49}$ and $du/dx = 3x^2$. By the chain rule,

$$\frac{dy}{dx} = \frac{dy}{du}\frac{du}{dx} = 50u^{49} \times 3x^2 = 150x^2(x^3 - 7)^{49}.$$

(Remember, we have to replace $u$ by $x^3 - 7$ in order to get everything in terms
of $x$.) Now we need to plug in $x = 2$; for this value of $x$, we have

$$\frac{dy}{dx} = 150(2)^2(2^3 - 7)^{49} = 150 \times 4 \times 1^{49} = 600.$$

Great! We’ve found the slope of the tangent line we’re looking for. Now we
need to find a point it goes through: just put $x = 2$ and see what $y$ is. In
fact, $y = (2^3 - 7)^{50} = 1^{50} = 1$. So the tangent line passes through $(2, 1)$.
Using the point-slope form, we see that the equation of the tangent line is
$(y - 1) = 600(x - 2)$, which you can rewrite as $y = 600x - 1199$ if you like.
And that’s all there is to finding tangent lines!
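
The three steps translate almost word for word into Python. In this sketch, `tangent_line` is a hypothetical helper that takes the function, its derivative (which you still have to work out by hand), and the value of $x$, and returns the slope and $y$-intercept of the tangent line.

```python
def tangent_line(f, f_prime, x0):
    """Return (slope, intercept) of the tangent line to y = f(x) at x = x0."""
    slope = f_prime(x0)            # step 1: the slope is the derivative at x0
    y0 = f(x0)                     # step 2: the point on the line is (x0, y0)
    intercept = y0 - slope * x0    # step 3: rewrite y - y0 = m(x - x0) as y = mx + b
    return slope, intercept

def f(x):
    return (x**3 - 7)**50

def f_prime(x):
    return 150 * x**2 * (x**3 - 7)**49

print(tangent_line(f, f_prime, 2))  # (600, -1199), that is, y = 600x - 1199
```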


6.4 Velocity and Acceleration

Another application of finding derivatives is to compute velocities and
accelerations of moving objects. In Section 5.2.2 of the previous chapter, we imagined











Suppose you throw a ball directly up in the air. It goes up and comes back 
down (unless it hits something or someone else catches it!). This is because 
the Earth’s gravitational pull exerts a force on the ball, pulling it toward the 
Earth. Newton — one of the pioneers of calculus — realized that the effect of 
the force is that the ball moves downward with constant acceleration. (We’ll 
assume that there’s no air resistance.) 

Since the ball is going up and down, we’d better reorient our number line 
so that it points up and down. Let’s set the 0 point as being on the ground, 
and we’ll make upward positive. Since the acceleration is downward, it must 
be a negative quantity, and since it’s constant, we can just call it —g. On 
Earth, g is about 9.8 meters per second squared, but it’s a lot less on the 
moon. Anyway, if we’re going to understand how this ball moves, we need to 
know its position and its velocity at time t. 

Let’s start off with velocity. We know that $a = dv/dt$. In the example
in the previous section, we knew what $v$ was, so we differentiated it to find
$a$. Unfortunately, this time we know what $a$ is (it’s the constant $-g$) and
we need to find $v$, so we’re all topsy-turvy here. The same thing happens
for $x$, once we know $v$. In both cases, we need to reverse the process of

It’s not hard to check that the equations $a = -g$, $v = -gt + u$, and
$x = -\frac{1}{2}gt^2 + ut + h$ are consistent. Differentiating
with respect to $t$, we see that $dv/dt = -g$, which is equal to $a$; and that
$dx/dt = -gt + u$, which is just $v$. So $a = dv/dt$ and $v = dx/dt$ after all. Also,
when $t = 0$, we see that $v = u$ and $x = h$. This means that the initial velocity
is $u$ and the initial height is $h$. Everything checks out.

Now, let’s look at an example of how to use the above formulas. Suppose
you throw a ball up from a height of 2 meters above the ground with a speed
of 3 meters per second. Taking $g$ to be 10 meters per second squared, we want
to know five things:


1. How long does it take for the ball to hit the ground? 

2. How fast is the ball moving when it hits the ground? 

3. How high does the ball go? 

4. If instead you throw the ball at the same speed but downward, how long 
does the ball take to hit the ground? 

5. In that case, how fast does it hit the ground? 

In the original situation, we know that $g = 10$, the initial height $h = 2$, and
the initial velocity $u = 3$. This means that the above formulas become

$$a = -10, \qquad v = -10t + 3, \qquad\text{and}\qquad x = -\tfrac{1}{2}(10)t^2 + 3t + 2 = -5t^2 + 3t + 2.$$


For part 1, we need to find how long it takes for the ball to get to the ground.
This surely happens when its height is 0. So set $x = 0$ and let’s find $t$: we get
$0 = -5t^2 + 3t + 2$. If you factor this quadratic as $-(5t + 2)(t - 1)$, you can
see that the solution of our equation is $t = 1$ or $t = -2/5$. Clearly the second
answer is unrealistic: the ball can’t hit the ground before you even throw it!
So the answer must be $t = 1$. That is, the ball hits the ground 1 second after
we throw it.

For part 2, we need to find the speed at the time when the ball hits the
ground. No problem: we know that $v = -10t + 3$, and that the ball hits the
ground when $t = 1$. Plugging that in, we see that $v = -10 + 3 = -7$. So the
velocity of the ball when it hits the ground is $-7$ meters per second. Why
negative? Because the ball is going downward when it hits, and downward is
negative. The speed of the ball is just the absolute value of the velocity, or 7
meters per second.

To solve the third part, you have to realize that the ball reaches the top
of its path when its velocity is exactly 0. On the way up, the velocity is
positive; on the way down, the velocity is negative; it must be 0 when it’s
changing from up to down. So, when is $v$ equal to 0? We just need to solve
$-10t + 3 = 0$. The answer is $t = 3/10$. That is, the ball reaches the top of its
trajectory three-tenths of a second after we release it. How high is it then?
Just plug $t = 3/10$ into the formula $x = -5t^2 + 3t + 2$ to see that

$$x = -5\left(\frac{3}{10}\right)^2 + 3\left(\frac{3}{10}\right) + 2 = \frac{49}{20}.$$

That is, the ball reaches a height of 49/20 meters above the ground.

For the last two parts, you’re throwing the ball downward instead. We still
have $g = 10$ and the initial height $h = 2$, but what is the starting velocity $u$?
Don’t make the mistake of thinking that $u$ is still 3! Since you are throwing
the ball downward, the initial velocity is negative. A speed of 3 meters per
second downward translates into an initial velocity $u = -3$. Omitting this
minus sign is a common mistake, so be warned. Anyway, our equations are
now

$$a = -10, \qquad v = -10t - 3, \qquad\text{and}\qquad x = -\tfrac{1}{2}(10)t^2 - 3t + 2 = -5t^2 - 3t + 2.$$

Notice how similar they are to the equations for the scenario when we threw
the ball upward. To solve part 4 of the problem, we need to find the time
the ball hits the ground. Just as we did in part 1, set $x = 0$; then we have
$0 = -5t^2 - 3t + 2 = -(5t - 2)(t + 1)$. So $t = 2/5$ or $t = -1$. This time we reject
$t = -1$, since it’s before we threw the ball, so we must have $t = 2/5$. That
is, the ball hits the ground two-fifths of a second after we throw it. It makes
sense that it’s less than the time taken when we threw the ball up (which was
1 second), since the ball doesn’t have to go up and then down. For the final
part, we want to see how fast the ball is moving when it hits the ground; so put
$t = 2/5$ in the formula for velocity. We get $v = -10(2/5) - 3 = -4 - 3 = -7$.
Once again, the ball hits the ground with a speed of 7 meters per second.
Interesting that it doesn’t matter whether you throw the ball up or down (as
long as it’s from the same height and with the same speed in both cases): it
still hits the ground with the same speed, although the time taken to hit the
ground is different.
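
All five answers can be reproduced with a short Python sketch of the formulas $v = -gt + u$ and $x = -\frac{1}{2}gt^2 + ut + h$. The function names are made up for illustration, and `time_to_ground` uses the quadratic formula rather than factoring, keeping only the root that isn’t negative.

```python
import math

def velocity(t, g, u):
    return -g * t + u

def height(t, g, u, h):
    return -0.5 * g * t**2 + u * t + h

def time_to_ground(g, u, h):
    # Solve -g t^2 / 2 + u t + h = 0 and keep the nonnegative root.
    return (u + math.sqrt(u**2 + 2 * g * h)) / g

def max_height(g, u, h):
    # Only meaningful for an upward throw (u >= 0): the top is where v = 0.
    t_top = u / g
    return height(t_top, g, u, h)

g, h = 10, 2

t_up = time_to_ground(g, 3, h)
print(t_up, velocity(t_up, g, 3))       # parts 1 and 2: about 1 second, -7 m/s
print(max_height(g, 3, h))              # part 3: about 2.45 = 49/20 meters

t_down = time_to_ground(g, -3, h)
print(t_down, velocity(t_down, g, -3))  # parts 4 and 5: about 0.4 seconds, -7 m/s
```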


6.5 Limits Which Are Derivatives in Disguise


That’s enough motion for now. Consider how you’d find the following limit:

$$\lim_{h \to 0} \frac{\sqrt[5]{32 + h} - 2}{h}.$$

It looks pretty hopeless. Even the trick of multiplying by the conjugate-type
expression $\sqrt[5]{32 + h} + 2$ doesn’t work because it’s a 5th root, not a square root
(try it and see for yourself!). So let’s take a break from this and consider a
related limit:

$$\lim_{h \to 0} \frac{\sqrt[5]{x + h} - \sqrt[5]{x}}{h}.$$

Note that $h$, not $x$, is the dummy variable here. Now this limit looks pretty
difficult too, but perhaps it rings a bell. It’s pretty similar to the limit in our
formula

$$\lim_{h \to 0} \frac{f(x + h) - f(x)}{h} = f'(x).$$

All you have to do is set $f(x) = \sqrt[5]{x}$ and note that $f'(x) = \frac{1}{5}x^{-4/5}$. (Here
we wrote $\sqrt[5]{x}$ as $x^{1/5}$ in order to find the derivative.) The derivative equation
becomes

$$\lim_{h \to 0} \frac{\sqrt[5]{x + h} - \sqrt[5]{x}}{h} = \frac{1}{5}x^{-4/5}.$$













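So the limit on the left is a derivative in disguise! We had to create a function
$f$ and differentiate it to solve the limit.

Now we can return to the original limit

$$\lim_{h \to 0} \frac{\sqrt[5]{32 + h} - 2}{h}.$$

This is actually a special case of the limit

$$\lim_{h \to 0} \frac{\sqrt[5]{x + h} - \sqrt[5]{x}}{h} = \frac{1}{5}x^{-4/5},$$

which we just worked out. If you set $x = 32$ in this limit, you get

$$\lim_{h \to 0} \frac{\sqrt[5]{32 + h} - \sqrt[5]{32}}{h} = \frac{1}{5} \times 32^{-4/5}.$$

Since $\sqrt[5]{32} = 2$ and $32^{-4/5} = 1/16$, we have shown that

$$\lim_{h \to 0} \frac{\sqrt[5]{32 + h} - 2}{h} = \frac{1}{5} \times 32^{-4/5} = \frac{1}{5} \times \frac{1}{16} = \frac{1}{80}.$$
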


Make no mistake: this is hard. There is a double disguise here: not only
are we dealing with a derivative, we’re actually evaluating the derivative at a
particular point (32 in this case). You’re better off generalizing the situation
first, then substituting the specific value of $x$. Here’s another example:

$$\lim_{h \to 0} \frac{\sqrt{(4 + h)^3 - 7(4 + h)} - 6}{h}.$$

This one could be done by multiplying top and bottom by the conjugate, but
it’s also a derivative in disguise. Since we are dealing with $4 + h$, let’s try
replacing 4 by $x$. The first term in the numerator becomes $\sqrt{(x + h)^3 - 7(x + h)}$.
This suggests that we might try setting $f(x) = \sqrt{x^3 - 7x}$. In Section 6.2.5
above, we saw that $f'(x) = (3x^2 - 7)/(2\sqrt{x^3 - 7x})$, so the equation

$$\lim_{h \to 0} \frac{f(x + h) - f(x)}{h} = f'(x)$$

becomes

$$\lim_{h \to 0} \frac{\sqrt{(x + h)^3 - 7(x + h)} - \sqrt{x^3 - 7x}}{h} = \frac{3x^2 - 7}{2\sqrt{x^3 - 7x}}.$$

Finally, if you put $x = 4$, and simplify everything (noticing that you have
$\sqrt{x^3 - 7x} = \sqrt{64 - 28} = \sqrt{36} = 6$), you get

$$\lim_{h \to 0} \frac{\sqrt{(4 + h)^3 - 7(4 + h)} - 6}{h} = \frac{3(4)^2 - 7}{2(6)} = \frac{41}{12}.$$

If you get stuck on a limit, it might be a derivative in disguise. Telltale
signs are that the dummy variable is by itself in the denominator, and the
numerator is the difference of two quantities. Even if this doesn’t happen,
you could still be dealing with a derivative in disguise; for example,

$$\lim_{h \to 0} \frac{h}{(x + h)^6 - x^6}$$

has the dummy variable in the numerator. No matter: just flip it over and
find this limit first:

$$\lim_{h \to 0} \frac{(x + h)^6 - x^6}{h}.$$

To do this, set $f(x) = x^6$, so that $f'(x) = 6x^5$. We have

$$\lim_{h \to 0} \frac{(x + h)^6 - x^6}{h} = \lim_{h \to 0} \frac{f(x + h) - f(x)}{h} = f'(x) = 6x^5.$$

Now just flip it over again and get

$$\lim_{h \to 0} \frac{h}{(x + h)^6 - x^6} = \frac{1}{6x^5}.$$

We’ll see a few other examples of limits which are derivatives in disguise in
the future (in Chapters 9 and 17, to be precise). Keep your eyes peeled: many
limits are derivatives in disguise, and your job is to unmask them.*
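
You can also unmask these limits numerically. The Python sketch below evaluates the difference quotient for smaller and smaller $h$ and compares it with the derivative values found above. The helper name `difference_quotient` and the particular values of $h$ are assumptions for this illustration, and the agreement is only approximate because of rounding.

```python
import math

def difference_quotient(f, x, h):
    """The fraction (f(x + h) - f(x)) / h whose limit is f'(x)."""
    return (f(x + h) - f(x)) / h

def fifth_root(x):
    return x ** 0.2

def square_root_example(x):
    return math.sqrt(x**3 - 7*x)

for h in (1e-2, 1e-4, 1e-6):
    print(h,
          difference_quotient(fifth_root, 32, h),          # approaches 1/80 = 0.0125
          difference_quotient(square_root_example, 4, h))  # approaches 41/12
```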


6.6 Derivatives of Piecewise-Defined Functions



Consider the following piecewise-defined function $f$:

$$f(x) = \begin{cases} 1 & \text{if } x \le 0, \\ x^2 + 1 & \text{if } x > 0. \end{cases}$$

Is this function differentiable? Let’s graph it and see:

[Figure: graph of $y = f(x)$.]

* Actually, if you use l’Hôpital’s Rule (see Chapter 14), you often don’t even need to
recognize when a limit is a derivative in disguise.
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Looks pretty smooth — no sharp corners. In fact, it’s pretty obvious that the 
function / is differentiable everywhere except perhaps at a: = 0. To the left of 
a: = 0, the function / inherits its differentiability from the constant function 
1, and to the right of x = 0, it inherits its differentiability from x 2 + 1. The 
question is, what happens at a: = 0, the interface between the two pieces? 

The first thing to check is that the function is actually continuous there. 
You can’t have differentiability without continuity, as we saw in Section 5.2.11 
of the previous chapter. To see that f is continuous at x = 0, we need to show 
that $\lim_{x \to 0} f(x) = f(0)$. Well, we can see from the definition of f that 
f(0) = 1. As for the limit, let’s break it up into left-hand and right-hand 
limits. For the left-hand limit, we have 

\[ \lim_{x \to 0^-} f(x) = \lim_{x \to 0^-} (1) = 1, \]

since f(x) = 1 when x is to the left of 0. As for the right-hand limit, 

\[ \lim_{x \to 0^+} f(x) = \lim_{x \to 0^+} (x^2 + 1) = 0^2 + 1 = 1, \]

since f(x) = x^2 + 1 when x is to the right of 0. So the left-hand limit equals the 
right-hand limit, which means that the two-sided limit exists and is 1. This 
agrees with f(0), so we have proved that f is continuous at x = 0. (Notice 
that for both the left-hand and right-hand limits, you effectively just have to 
substitute x = 0 into the appropriate piece of f to get the limit.) 

We still need to show that f is differentiable at x = 0. To do this, we have 
to show that the left-hand and right-hand derivatives match at x = 0 (see 
Section 5.2.10 in the previous chapter to refresh your memory of left-hand 
and right-hand derivatives). To the left of 0, we have f(x) = 1, so f'(x) = 0 
in this case. It turns out that we can push it all the way up to x = 0 like this: 

\[ \lim_{x \to 0^-} f'(x) = \lim_{x \to 0^-} 0 = 0. \]

This shows that the left-hand derivative of f at x = 0 is 0. (See Section A.6.10 
of Appendix A for more details.) To the right of 0, we have f(x) = x^2 + 1, so 
f'(x) = 2x there. Again, we can push this down to x = 0: 

\[ \lim_{x \to 0^+} f'(x) = \lim_{x \to 0^+} 2x = 2 \times 0 = 0. \]

So the right-hand derivative of f at x = 0 is 2 × 0 = 0. Since the left-hand and 
right-hand derivatives at x = 0 match, the function is differentiable there. 
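One way to see the matching one-sided derivatives in action is to compute secant slopes at 0 from each side. The following is just a sketch in Python, taking f(x) = 1 for x ≤ 0 and f(x) = x^2 + 1 for x > 0; the helper `one_sided_slope` is a made-up name for illustration.

```python
def f(x):
    # piecewise function: 1 to the left of (and at) 0, x^2 + 1 to the right
    return 1.0 if x <= 0 else x**2 + 1.0

def one_sided_slope(func, a, h):
    # slope of the secant line from a to a + h;
    # a negative h probes the left side, a positive h the right side
    return (func(a + h) - func(a)) / h

for h in (1e-1, 1e-3, 1e-5):
    left = one_sided_slope(f, 0.0, -h)
    right = one_sided_slope(f, 0.0, h)
    print(h, left, right)  # both columns should head toward 0
```

The left-hand slopes are exactly 0 and the right-hand slopes shrink along with h, which is consistent with both one-sided derivatives being 0.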

So, to check that a piecewise-defined function is differentiable at a point 
where the pieces join together, you need to check that the pieces agree at the 
join point (for continuity) and that the derivatives of the pieces agree at the 
join point. Otherwise it’s not differentiable at the join point.* If you have 
more than two pieces, you have to check continuity and differentiability at all 
the join points. 


* Actually, this is only true if the left- and right-hand limits of the derivatives at the 
join points exist and are finite. See Section 7.2.3 in the next chapter for an example of this. 

Let’s look at one more example of differentiating a piecewise-defined function. 
Suppose that 

\[ g(x) = \begin{cases} |x^2 - 4| & \text{if } x \le 1, \\ -2x + 5 & \text{if } x > 1. \end{cases} \]


Where is g differentiable? You might think that the only issue is at the join 
point x = 1, but actually the absolute value makes life more complicated. 
Remember, the absolute value function is really a piecewise-defined function 
in disguise! In particular, |x| = x when x ≥ 0, but |x| = −x when x < 0. It 
follows that 

\[ |x^2 - 4| = \begin{cases} x^2 - 4 & \text{if } x^2 - 4 \ge 0, \\ -(x^2 - 4) & \text{if } x^2 - 4 < 0. \end{cases} \]


In fact, the inequality x^2 − 4 < 0 can be rewritten as x^2 < 4, which means 
that −2 < x < 2. (Be careful to include the −2 < x bit as well as the more 
obvious x < 2 bit!) So we can simplify this a little to get 

\[ |x^2 - 4| = \begin{cases} x^2 - 4 & \text{if } x \ge 2 \text{ or } x \le -2, \\ -x^2 + 4 & \text{if } -2 < x < 2. \end{cases} \]


Now, in the definition of g(x) above, the term |x^2 − 4| only appears when 
x ≤ 1. So, we can throw everything together and remove the absolute values 
once and for all, rewriting g(x) as follows: 

\[ g(x) = \begin{cases} x^2 - 4 & \text{if } x \le -2, \\ -x^2 + 4 & \text{if } -2 < x \le 1, \\ -2x + 5 & \text{if } x > 1. \end{cases} \]


So actually there are two join points: x = −2 and x = 1. Since the three 
pieces making up g are differentiable everywhere, we know that g itself is 
differentiable everywhere except perhaps at the join points. Let’s check the 
join points one at a time, starting with x = −2. First, continuity. From the 
left, we have 

\[ \lim_{x \to (-2)^-} g(x) = \lim_{x \to (-2)^-} (x^2 - 4) = (-2)^2 - 4 = 0, \]

while from the right, we have 

\[ \lim_{x \to (-2)^+} g(x) = \lim_{x \to (-2)^+} (-x^2 + 4) = -(-2)^2 + 4 = 0. \]

Since the limits are equal, g is continuous at x = −2. Now, let’s check the 
derivatives: for the left-hand derivative, we have 

\[ \lim_{x \to (-2)^-} g'(x) = \lim_{x \to (-2)^-} 2x = 2(-2) = -4, \]

whereas for the right-hand derivative, we have 

\[ \lim_{x \to (-2)^+} g'(x) = \lim_{x \to (-2)^+} -2x = -2(-2) = 4. \]

Since −4 ≠ 4, the left-hand and right-hand derivatives don’t match, so g is 
not differentiable at x = −2. 

It actually looks continuous everywhere and differentiable everywhere except 
where there’s that sharp corner at (−2, 0). In particular, everything’s nice at 
the join point x = 1, just as we calculated. 
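Both join points can be spot-checked numerically by comparing secant slopes from the left and from the right. This is only a sketch in Python, assuming g(x) = |x^2 − 4| for x ≤ 1 and g(x) = −2x + 5 for x > 1; `one_sided_slope` is a hypothetical helper, not something from the text.

```python
def g(x):
    # |x^2 - 4| up to and including x = 1, then the line -2x + 5
    return abs(x**2 - 4) if x <= 1 else -2 * x + 5

def one_sided_slope(func, a, h):
    # secant slope from a to a + h (h < 0: from the left, h > 0: from the right)
    return (func(a + h) - func(a)) / h

h = 1e-6
for a in (-2.0, 1.0):
    left = one_sided_slope(g, a, -h)
    right = one_sided_slope(g, a, h)
    print(a, left, right)
```

At x = −2 the two slopes come out near −4 and 4 (the sharp corner), while at x = 1 they both come out near −2, so the pieces join smoothly there.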

6.7 Sketching Derivative Graphs Directly 

Suppose you have a graph of a function but not its equation, and you want 
to sketch the graph of its derivative. Formulas and rules aren’t going to help 
you here: instead, you need a good understanding of differentiation. 

Here’s the basic idea. Imagine the graph of the function as a mountain, 
and imagine that there is a little mountain-climber walking up and down the 
graph from left to right. At each point of the climb, the climber calls out 
how difficult he or she thinks the climb is. If the terrain is flat, the climber 
calls out the number 0 for the degree of difficulty. If the terrain goes uphill, 
the climber calls out a positive number; the steeper the climb, the higher the 
number. If the terrain goes downhill, then the climb is actually easy, so the 
degree of difficulty is negative. That is, the climber will call out a negative 
number. The more downhill the terrain, the easier it is, so the number will 
be more negative. (If it’s really steep going downhill, it might be difficult to 
climb down safely, but it’s certainly easy to descend quickly!) 

One important point: the height of the mountain itself isn’t relevant. It’s 
only the steepness that matters. In particular, you could shift the whole graph 
upward, and the climber would still be calling out the same degree of difficulty. 
A consequence of this is that if you are drawing the graph of a derivative from 
the graph of a function, the x-intercepts of the function are not important! 

Let’s look at an example: sketch the derivative of the following 
fearsome-looking function: 



Don’t panic. Just draw a little mountain-climber at a whole bunch of 
different points and imagine the climber shouting out the degree of difficulty 
at each point. Then all you have to do is plot these degrees of difficulty on 
another set of axes. Of particular interest are the points where the path is 
flat; this can occur in a long flat section (such as between x = 5 and x = 6 in 
the above graph), or at the top of a crest (such as at x = −5 or x = 1) or at 
the bottom of a valley (such as at x = −2 or x = 3). You definitely want to 
draw the mountain-climber there. Here’s what the graph of f looks like with 
the climber in a bunch of positions: 

Now let’s draw a set of axes for our graph of the derivative. Label the y-axis 
as “degree of difficulty,” ranging from hard, down to flat at the origin, down 
to easy. Then you should be able to pencil in some points based on what the 
various copies of the little climber have shouted out. Remember, the climber 
doesn’t care how high the mountain is, only how steep the climb is! Based 
on this, you get the following points: 



[Figure: points of y = f'(x) plotted for x from −6 to 9; the vertical axis runs from “hard” at the top, through “flat” at 0, down to “easy” at the bottom.] 




Here’s a detailed explanation of how we came to these conclusions: 

• At the far left of the graph of y = f(x), the climber starts out going 
only slightly uphill. So we’ll plot points of height a little above 0. 

• Moving along to x = −6, the climber starts to go uphill, so the difficulty 
has gone up, so the points get higher (more difficult). 

• Then it starts getting a little easier, until when x = −5, the climber 
has reached the top of the crest and it’s now flat. In particular, when 
x = −5, the derivative has an x-intercept. 

• After x = −5, the original curve starts to go downhill, first gently and 
then more and more steeply. This means that it’s getting easier and 
easier, until it gets ridiculously easy. So the derivative also has a vertical 
asymptote at x = −4. 

• On the other side of the asymptote, the climb is also really easy: the 
climber is going downhill, starting very steeply and then leveling out at 
the valley when x = −2. So, to the right of its vertical asymptote, the 
derivative curve actually starts at −∞ (really easy) and climbs up to 0 at 
x = −2. (The fact that there are x-intercepts between −5 and −4 and also 
between −4 and −3 is irrelevant. The x-intercepts of the original function 
don’t matter.) 

• After the bottom of the valley at x = −2, the climber has to go uphill 
for a while, so it gets harder. After x = 0, though, it gets a little easier, 
until he or she reaches the top of the hill at x = 1. This means that 
the derivative curve goes up until x = 0, then comes back down to an 
x-intercept at x = 1. 

• The reverse happens on the way to the bottom of the valley at x = 3: 
it gets easier and easier until x = 2, then it flattens out while still being 
downhill. So the derivative curve goes down, reaches a minimum at 
x = 2, then comes back up for an x-intercept at x = 3. 

• From the bottom of the valley at x = 3, the climb gets steadily harder 
until x = 4. Between x = 4 and x = 5, however, the climb is of uniform 
difficulty, since the slope is constant. So the derivative curve increases 
from x = 3 until x = 4, but then it stays at the same height (degree of 
difficulty) between x = 4 and x = 5. 

• At x = 5, the slope abruptly changes: it becomes flat without any 
warning, then stays flat until x = 6. So the derivative curve must 
jump down to 0 and stay there until x = 6. The derivative will have a 
discontinuity at x = 5. 

• After x = 6, the climber finds things easier and easier as the curve dips 
down to the vertical asymptote at x = 7. The derivative curve also has 
a vertical asymptote there. 

• To the right of the vertical asymptote, the climb is extremely difficult, 
but it does get a little easier as x moves up to 9. So the derivative curve 
will start very high on the right side of x = 7 and then get a little lower 
as the climb becomes easier. 

Now, just connect the dots! Here are the graphs of y = f(x) and y = f'(x): 

Let’s just summarize the ideas that we used: 

• When the original graph is flat, the derivative graph has an x-intercept. 
In the above example, this occurs at x = −5, x = −2, x = 1, x = 3, and 
everywhere in the interval [5, 6]. 

• When a portion of the original graph is a straight line, the derivative 
graph is constant (this occurs in the interval [4, 5] in our example). 

• If the original has a horizontal asymptote, it’s often true that the derivative 
also has one, but in that case it will be at y = 0 instead of the original 
height of the asymptote (as at the left-hand edge of our example). 

• Vertical asymptotes in the original usually lead to vertical asymptotes in 
the derivative at the same place,* although the directions may change. 
For example, in our graph above, at x = 7 the original curve goes to −∞ 
on both sides of the asymptote, but the derivative has opposite signs. 
The vertical asymptote at x = −4 is similarly affected. 

When in doubt, use the trusty mountain-climber! 
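If you happen to have a formula rather than just a picture, you can let a computer play the role of the climber by estimating slopes with small secant lines. The sketch below is in Python and uses a made-up terrain (a sine wave) purely for illustration; none of these names come from the text.

```python
import math

def slope(func, x, h=1e-5):
    # central-difference estimate of the slope (the "degree of difficulty") at x
    return (func(x + h) - func(x - h)) / (2 * h)

def terrain(x):
    # hypothetical mountain, chosen only as an example
    return math.sin(x)

def raised_terrain(x):
    # the same mountain shifted upward by 10 units
    return terrain(x) + 10.0

CREST = math.pi / 2  # top of a crest of the sine wave

for x in (0.0, CREST, math.pi, 3 * math.pi / 2):
    # the two slope columns should agree: height doesn't matter, only steepness
    print(x, slope(terrain, x), slope(raised_terrain, x))
```

The slope at the crest comes out essentially 0, and shifting the whole graph upward leaves every slope essentially unchanged, just as the climber would report.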


*It’s not actually true in general that if a function has a vertical asymptote, then its 
derivative also has a vertical asymptote at the same place. An example is y = 1/x + sin(1/x) 
at x = 0. Can you see why? 





CHAPTER 7 


Trig Limits and Derivatives 


So far, most of our limits and derivatives have involved only polynomials 
or poly-type functions. Now let’s expand our horizons by looking at trig 
functions. In particular, we’ll focus on the following topics: 

• the behavior of trig functions at small, large, and other argument values; 

• derivatives of trig functions; and 

• simple harmonic motion. 


7.1 Limits Involving Trig Functions 


Consider the following two limits: 

\[ \lim_{x \to 0} \frac{\sin(5x)}{x} \quad \text{and} \quad \lim_{x \to \infty} \frac{\sin(5x)}{x}. \]

They look almost the same. The only difference is that the first limit is taken 
as x → 0 while the second is taken as x → ∞. What a difference, though! As 
we’ll soon see, the answers and the techniques used have almost nothing in 
common. So, it’s really important to note whether your limit involves taking 
the sine (or cosine or tangent) of really small numbers (as in the first limit 
above) or really large numbers (as in the second limit). We’ll look at these 
two cases separately, then see what happens when neither case applies. 

Before we do, it’s important to note that you can’t tell what case you’re 
dealing with just by looking at whether x → 0 or x → ∞. You need to 
see where you are evaluating your trig functions. For example, consider the 
following pair of limits: 


\[ \lim_{x \to 0} x \sin\left(\frac{5}{x}\right) \quad \text{and} \quad \lim_{x \to \infty} x \sin\left(\frac{5}{x}\right). \]

In the first limit, you are taking the sine of 5/x, which is actually a huge 
number (positive or negative, depending on the sign of x) when x is near 0. 
So the first limit isn’t covered by the small case at all: it belongs to the large 
case! Similarly, in the second limit, the quantity 5/x is very small when x is 
large, so that’s really the small case. We’ll solve all four of the above limits 
in the next few sections. 

7.1.1 The small case 

We know sin(0) = 0. OK, so what does sin(x) look like when x is near 0? 
Sure, sin(x) is near 0 as well in that case, but how near to 0 is it? It turns 
out that sin(x) is approximately the same as x itself! 

For example, if you take your calculator, put it in radian mode, and find 
sin(0.1), you get about 0.0998, which is very close to 0.1. Try it with a number 
even closer to 0 and you’ll see that taking the sine of your number leaves you 
with something very close to your original number. 
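If you’d rather not punch numbers into a calculator, the same experiment is a few lines of Python. This is just a sketch; `sine_ratio` is a name invented here, and Python’s `math.sin` works in radians, which is what we want.

```python
import math

def sine_ratio(x):
    # sin(x) / x, which should be close to 1 when x is small (x in radians)
    return math.sin(x) / x

for x in (0.1, 0.01, 0.001):
    print(x, math.sin(x), sine_ratio(x))
```

The middle column stays very close to x itself, and the ratio in the last column creeps toward 1 as x shrinks.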

It’s always good to look at a picture of the situation. Here’s a graph of 
y = sin(x) and y = x on the same set of axes, concentrating only on the values 
of x between −1 and 1 (approximately): 



The graphs are very similar, especially when x is close to 0. (Of course, if we 
graph a little more of y = sin(x), it starts making the familiar waves; it’s only 
when we zoom in like this that we see how close sin(x) is to x.) So there’s 
good justification for making the statement that sin(x) is close to x when x 
is small. If sin(x) were actually equal to x, then 

\[ \frac{\sin(x)}{x} = 1 \]

would be true. In fact, the above equation is never true, but it is true in the 
limit as x → 0: 

\[ \lim_{x \to 0} \frac{\sin(x)}{x} = 1. \]

This is very important. It’s basically the key to doing calculus involving 
trig functions. We’ll use it in Section 7.2 to find the derivatives of the trig 
functions, and we’ll actually prove it in Section 7.1.5 below. 

How about cos(x)? Well, cos(0) = 1, so things are very different in this 
case. For the moment, let’s just say that the cosine of a small number is very 
close to 1. We write 

\[ \lim_{x \to 0} \cos(x) = 1, \]

taking special care to notice that there’s no factor of x in the denominator 
as there is in the previous formula involving sin(x). What if you do put a 
factor of x in the denominator? We’ll see very soon, but first I want to look 
at tan(x). 

The key is to write tan(x) as sin(x)/cos(x). The numerator is sin(x), 
which is close to x when x is small. On the other hand, the denominator is 
close to 1. If there’s any justice in the world, then the ratio should behave 
like x/1, which is just x. In fact, this is true, as we can see by isolating the 
harmless factor cos(x) in the denominator: 

\[ \lim_{x \to 0} \frac{\tan(x)}{x} = \lim_{x \to 0} \frac{\frac{\sin(x)}{\cos(x)}}{x} = \lim_{x \to 0} \left(\frac{\sin(x)}{x}\right)\left(\frac{1}{\cos(x)}\right) = (1)\left(\frac{1}{1}\right) = 1. \]

So we have shown that 

\[ \lim_{x \to 0} \frac{\tan(x)}{x} = 1. \]

This means that sin(x) and tan(x) behave in a similar way when x is small, 
but cos(x) is the odd one out. Let’s take a look at what happens to cos(x)/x 
as x → 0. So we are trying to understand 

\[ \lim_{x \to 0} \frac{\cos(x)}{x}. \]

If you just substitute x = 0, then you get 1/0. This means that the graph of 
y = cos(x)/x has a vertical asymptote at x = 0. It looks very much like 1/x 
for small x; in particular, you should try to convince yourself that 

\[ \lim_{x \to 0^+} \frac{\cos(x)}{x} = \infty, \quad \lim_{x \to 0^-} \frac{\cos(x)}{x} = -\infty, \quad \text{so} \quad \lim_{x \to 0} \frac{\cos(x)}{x} \text{ DNE.} \]

(Remember, “DNE” stands for “does not exist.”) This is really different from 
what happens with sin or tan in place of cos. 
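To see how differently tan(x)/x and cos(x)/x behave near 0, it helps to tabulate both on each side of 0. Here is a small sketch in Python; the function names `tan_ratio` and `cos_ratio` are invented for this illustration.

```python
import math

def tan_ratio(x):
    # tan(x) / x, expected to be near 1 for small x of either sign
    return math.tan(x) / x

def cos_ratio(x):
    # cos(x) / x, expected to blow up like 1/x near 0
    return math.cos(x) / x

for x in (0.1, 0.001, 0.00001, -0.00001, -0.001, -0.1):
    print(x, tan_ratio(x), cos_ratio(x))
```

The tan column hugs 1 from both sides, while the cos column is hugely positive just to the right of 0 and hugely negative just to the left, which is what a two-sided limit that does not exist looks like numerically.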


7.1.2 Solving problems: the small case 

Here’s a simple example: find 

\[ \lim_{x \to 0} \frac{\sin(x^2)}{x^2}. \]

First note that when x is near 0, so is x^2, so we are indeed taking the sine of 
a small number. Now, we know that the following limit holds: 

\[ \lim_{x \to 0} \frac{\sin(x)}{x} = 1. \]

If you replace x by x^2 (which is a continuous function of x), then you get the 
following valid limit: 

\[ \lim_{x^2 \to 0} \frac{\sin(x^2)}{x^2} = 1. \]

This is almost the limit we want. In fact, the only thing we need to note is 
that x^2 → 0 when x → 0, so we can finally evaluate our limit as 

\[ \lim_{x \to 0} \frac{\sin(x^2)}{x^2} = 1. \]

Of course, there’s nothing special about x^2; any other continuous function of 
x that is 0 when x = 0 will do. In particular, we know all the following limits 
automatically: 


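\[ \lim_{x \to 0} \frac{\sin(5x)}{5x} = 1; \quad \lim_{x \to 0} \frac{\sin(3x^7)}{3x^7} = 1; \quad \text{and even} \quad \lim_{x \to 0} \frac{\sin(\sin(x))}{\sin(x)} = 1. \]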


These are all true with “sin” replaced by “tan,” but not by “cos”! Anyway, 
we can summarize the whole situation by noting that 

\[ \lim_{x \to 0} \frac{\sin(\text{small})}{\text{same small}} = 1 \quad \text{and} \quad \lim_{x \to 0} \frac{\tan(\text{small})}{\text{same small}} = 1. \]

It’s vital that the denominator matches the argument of sin or tan in the 
numerator, and also that this quantity is small when x is small. Of course, 
for cosine, the best we can say is 

\[ \lim_{x \to 0} \cos(\text{small}) = 1. \]

There’s no need to worry about matching anything in this case! 

Now let’s return to one of the examples from the beginning of the chapter: 

\[ \lim_{x \to 0} \frac{\sin(5x)}{x}. \]

The problem is that we are taking the sine of 5x, but we only have x in the 
denominator. These two quantities don’t match. Never mind: we’ll take that 
sin(5x) term and divide it by 5x, which does match, then multiply it again to 
make it work out. That is, we’ll rewrite sin(5x) as 

\[ \frac{\sin(5x)}{5x} \times (5x). \]

This is almost the same trick as the one we used in Section 4.3 of Chapter 4 
for limits involving rational functions! Let’s see how it works in this case: 

\[ \lim_{x \to 0} \frac{\sin(5x)}{x} = \lim_{x \to 0} \frac{\frac{\sin(5x)}{5x} \times (5x)}{x}. \]

Now keep the sin(5x)/5x part together, but cancel out a factor of x from the 
other two factors to get 

\[ \lim_{x \to 0} \frac{\sin(5x)}{x} = \lim_{x \to 0} \frac{\sin(5x)}{5x} \times 5. \]

As we saw above, since we have matched the 5x terms (once in the denominator 
and once in the argument of sin), we know that the fraction has limit 
1, so the total limit is 5. In one line, the solution looks like this: 

\[ \lim_{x \to 0} \frac{\sin(5x)}{x} = \lim_{x \to 0} \frac{\frac{\sin(5x)}{5x} \times (5x)}{x} = \lim_{x \to 0} \frac{\sin(5x)}{5x} \times 5 = 5. \]
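A quick numerical check makes the answer 5 believable. The following is a rough sketch in Python; `quotient` is just a name picked for this example.

```python
import math

def quotient(x):
    # sin(5x) / x, the expression whose limit as x -> 0 we just found
    return math.sin(5 * x) / x

for x in (0.1, 0.01, 0.001, -0.001):
    print(x, quotient(x))  # values should approach 5 from either side
```

Notice that the mismatch between 5x and x shows up as the factor of 5, exactly as in the one-line solution.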

Now let’s check out a harder example. What is 

\[ \lim_{x \to 0} \frac{\sin^3(2x)\cos(5x^{19})}{x \tan(5x^2)}? \]

Let’s look at the four factors of this expression one at a time. First, consider 
sin^3(2x). This is just another way of writing (sin(2x))^3. To deal with sin(2x), 
we’d divide and multiply by 2x; so to deal with its cube, we divide and 
multiply by (2x)^3 instead. That is, we’ll replace (sin(2x))^3 by 

\[ \frac{(\sin(2x))^3}{(2x)^3} \times (2x)^3. \]

How about the cos(5x^19) factor? Well, when x is small, so is 5x^19, so we are 
just taking the cosine of a small number. This should be 1 in the limit, so we 
don’t touch this second factor. 

In the denominator, we have a factor x, which we can’t do anything with 
(nor do we want to; it’s really easy to deal with already!). That leaves the 
tan(5x^2) factor. We simply divide and multiply by (5x^2), so that we will be 
replacing tan(5x^2) by 

\[ \frac{\tan(5x^2)}{5x^2} \times (5x^2). \]

Putting all of this together, we have 

\[ \lim_{x \to 0} \frac{\sin^3(2x)\cos(5x^{19})}{x\tan(5x^2)} = \lim_{x \to 0} \frac{\left(\frac{(\sin(2x))^3}{(2x)^3} \times (2x)^3\right)\cos(5x^{19})}{x\left(\frac{\tan(5x^2)}{5x^2} \times (5x^2)\right)}. \]

Now let’s pull out all the powers of x that don’t match the trig functions: the 
(2x)^3 term in the numerator and the x and (5x^2) terms in the denominator. 
Then we rewrite the fraction (sin(2x))^3/(2x)^3 as (sin(2x)/2x)^3 and simplify 
to see that the limit becomes 

\[ \lim_{x \to 0} \frac{\left(\frac{\sin(2x)}{2x}\right)^3 \cos(5x^{19})}{\frac{\tan(5x^2)}{5x^2}} \times \frac{(2x)^3}{x(5x^2)} = \lim_{x \to 0} \frac{\left(\frac{\sin(2x)}{2x}\right)^3 \cos(5x^{19})}{\frac{\tan(5x^2)}{5x^2}} \times \frac{8x^3}{5x^3}. \]

Finally, we can cancel out x^3 from top and bottom, and take limits. Since 
the sin and tan terms have matching numerators and denominators, and 
cos(small) → 1, the limit is 

\[ \frac{(1)^3(1)}{1} \times \frac{8}{5} = \frac{8}{5}. \]
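With this many factors floating around, it’s reassuring to plug in a few small values of x and watch the expression approach 8/5 = 1.6. Here is a sketch in Python; `expression` is an illustrative name only.

```python
import math

def expression(x):
    # sin^3(2x) cos(5x^19) / (x tan(5x^2))
    numerator = math.sin(2 * x) ** 3 * math.cos(5 * x**19)
    denominator = x * math.tan(5 * x**2)
    return numerator / denominator

for x in (0.1, 0.01, 0.001):
    print(x, expression(x), 8 / 5)
```

The cosine factor is indistinguishable from 1 here, which matches the decision not to touch it in the algebra.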
Here’s another example from the beginning of the chapter: what is 

\[ \lim_{x \to \infty} x \sin\left(\frac{5}{x}\right)? \]

As we saw, this example does belong in this section, because when x is large, 
the quantity 5/x is small. So we use the same method, in this case dividing 
and multiplying sin(5/x) by 5/x, to write: 

\[ \lim_{x \to \infty} x \sin\left(\frac{5}{x}\right) = \lim_{x \to \infty} x \times \frac{\sin(5/x)}{5/x} \times \frac{5}{x}. \]

Now we can cancel out a factor of x to simplify this down to 

\[ \lim_{x \to \infty} 5 \times \frac{\sin(5/x)}{5/x}. \]

Thinking of “small” as 5/x, we can immediately see that the limit of the big 
fraction as x → ∞ is 1, and so the overall answer is 5. 

It’s also possible to have trig limits involving sec, csc, or cot. For example, 
what is 

\[ \lim_{x \to 0} \sin(3x)\cot(5x)\sec(7x)? \]

To do this, the best bet is to write it in terms of cos, sin, or tan, as follows: 

\[ \lim_{x \to 0} \sin(3x)\cot(5x)\sec(7x) = \lim_{x \to 0} \frac{\sin(3x)}{\tan(5x)\cos(7x)}. \]

Now we can do our standard trick of multiplying and dividing for the sin and 
tan terms, but ignoring the cos term, to see that the limit is equal to 

\[ \lim_{x \to 0} \frac{\frac{\sin(3x)}{3x} \times (3x)}{\left(\frac{\tan(5x)}{5x} \times (5x)\right)\cos(7x)}. \]

Now the (3x) and the (5x) terms cancel to leave 3/5, and all the other fractions 
tend to 1 in the limit, so you can see that the overall limit is 3/5. 
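Python’s `math` module has no cot or sec, so a numerical check forces you to do the same rewriting in terms of tan and cos. A minimal sketch, with `expression` as a made-up name:

```python
import math

def expression(x):
    # sin(3x) cot(5x) sec(7x), written as sin(3x) / (tan(5x) cos(7x))
    return math.sin(3 * x) / (math.tan(5 * x) * math.cos(7 * x))

for x in (0.1, 0.01, 0.001):
    print(x, expression(x), 3 / 5)
```

For small x the values settle near 0.6, in line with the answer 3/5.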

There is one thing you have to be very careful of: when you say that sin(x) 
behaves like x when x is small, you should only use this fact in the context of 
products or quotients. For example, 

\[ \lim_{x \to 0} \frac{x - \sin(x)}{x^3} \]

cannot be done by the methods of this chapter. It is a mistake to say that 
sin(x) behaves like x, so x − sin(x) behaves like 0. (In fact, nothing behaves 
like 0 except for the constant function 0 itself!) In order to solve the above 
limit, you need l’Hôpital’s Rule (see Chapter 14) or Maclaurin series (see 
Chapter 24). On the other hand, here’s a limit which has a similar difficulty 
that we can nevertheless solve now: 

\[ \lim_{x \to 0} \frac{1 - \cos^2(x)}{x^2}. \]

Again, you can’t just say that cos(x) behaves like 1 when x is small, so 
1 − cos^2(x) behaves like 1 − 1^2 = 0. So we just use cos^2(x) + sin^2(x) = 1 
to rewrite the numerator as sin^2(x): 

\[ \lim_{x \to 0} \frac{1 - \cos^2(x)}{x^2} = \lim_{x \to 0} \frac{\sin^2(x)}{x^2}. \]

Since sin^2(x) is another way of writing (sin(x))^2, we can rewrite the limit as 

\[ \lim_{x \to 0} \left(\frac{\sin(x)}{x}\right)^2. \]

This limit is simply 1^2 = 1. So 

\[ \lim_{x \to 0} \frac{1 - \cos^2(x)}{x^2} = 1. \]

In effect, we’re saying that 1 − cos^2(x) behaves like x^2 when x is small, not 
like 0 after all. Anyway, let’s use the same idea to solve some other limits: 

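\[ \lim_{x \to 0} \frac{1 - \cos(x)}{x^2} \quad \text{and} \quad \lim_{x \to 0} \frac{1 - \cos(x)}{x}. \]

We’ll do both of these limits with the same clever trick. The idea is to multiply 
top and bottom by 1 + cos(x) so that the numerator becomes 1 − cos^2(x), which 
we write as sin^2(x). In the first case, we have 

\[ \lim_{x \to 0} \frac{1 - \cos(x)}{x^2} = \lim_{x \to 0} \frac{1 - \cos(x)}{x^2} \times \frac{1 + \cos(x)}{1 + \cos(x)} = \lim_{x \to 0} \frac{1 - \cos^2(x)}{x^2} \times \frac{1}{1 + \cos(x)} \]

\[ = \lim_{x \to 0} \frac{\sin^2(x)}{x^2} \times \frac{1}{1 + \cos(x)} = \lim_{x \to 0} \left(\frac{\sin(x)}{x}\right)^2 \times \frac{1}{1 + \cos(x)} = 1^2 \times \frac{1}{1 + 1} = \frac{1}{2}. \]

Here we used the fact that cos(0) = 1. The second example is similar: 

\[ \lim_{x \to 0} \frac{1 - \cos(x)}{x} = \lim_{x \to 0} \frac{1 - \cos(x)}{x} \times \frac{1 + \cos(x)}{1 + \cos(x)} = \lim_{x \to 0} \frac{1 - \cos^2(x)}{x} \times \frac{1}{1 + \cos(x)} = \lim_{x \to 0} \frac{\sin^2(x)}{x} \times \frac{1}{1 + \cos(x)}. \]

At this point, we could divide and multiply the sin^2(x) term by x^2, but here’s 
a simpler way to handle the limit: simply write sin^2(x) as sin(x) × sin(x), 
and group one of the sin(x) factors with the x in the denominator. The limit 
becomes 

\[ \lim_{x \to 0} \left(\sin(x) \times \frac{\sin(x)}{x} \times \frac{1}{1 + \cos(x)}\right) = 0 \times 1 \times \frac{1}{1 + 1} = 0, \]

since sin(0) = 0. This last limit will be useful in Section 7.2 below, so let’s 
summarize it as something to keep in mind: 

\[ \lim_{x \to 0} \frac{1 - \cos(x)}{x} = 0. \]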
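These two limits are easy to mix up, so here is a small numerical comparison, sketched in Python with invented function names. One caution that is a fact about floating-point arithmetic rather than about calculus: if x is made extremely tiny, the subtraction 1 − cos(x) loses accuracy, so the sketch sticks to moderately small values.

```python
import math

def over_x_squared(x):
    # (1 - cos(x)) / x^2, expected to approach 1/2
    return (1 - math.cos(x)) / x**2

def over_x(x):
    # (1 - cos(x)) / x, expected to approach 0
    return (1 - math.cos(x)) / x

for x in (0.1, 0.01, 0.001):
    print(x, over_x_squared(x), over_x(x))
```

The first column of results heads toward 0.5 and the second toward 0, so dividing by x instead of x^2 really does change the answer.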


Enough of the small case; let’s see how to deal with limits involving trig 
functions evaluated at large numbers. 

7.1.3 The large case 

Consider the limit 

\[ \lim_{x \to \infty} \frac{\sin(x)}{x}. \]

As we just saw, if x → 0 instead of ∞, then the limit is 1. This is because 
sin(x) behaves like x when x is small. How does sin(x) behave when x gets 
larger and larger? It just keeps on oscillating between −1 and 1. So it doesn’t 
really “behave” like anything when x is large. Often one is forced to resort to 
one of the simplest things you can say about sin(x) (and also cos(x)): 

\[ -1 \le \sin(x) \le 1 \quad \text{and} \quad -1 \le \cos(x) \le 1 \quad \text{for any } x. \]

This is pretty darn handy for applying the sandwich principle (see Section 3.6 
in Chapter 3). In fact, we saw on page 53 that 

\[ \lim_{x \to \infty} \frac{\sin(x)}{x} = 0. \]

Take a look back at the proof right now to refresh your memory. 

Remember how cos(x) is the odd one out when x is small? Unlike sin(x) 
and tan(x), it doesn’t behave like x itself. When x is large, on the other hand, 
tan(x) is the odd one out. There are no inequalities for tan(x) similar to the 
boxed inequalities for sin(x) and cos(x) above; this is because tan(x) keeps 
on having vertical asymptotes and never settles down when x becomes large 
(see page 37 for the graph of y = tan(x)). 

Here’s a much harder example using the sandwich principle: find 

\[ \lim_{x \to \infty} \frac{x \sin(11x^7) - \frac{1}{2}}{2x^4}. \]

The gut feeling is that the sin(11x^7) term isn’t doing much, so the top is really 
of size about x. The x^4 on the bottom should overwhelm the numerator, so 
the whole thing should go to 0 as x → ∞. In order to show this, let’s look at 
the numerator first. We know that the sine of any number is between −1 and 
1, so it’s true that 

\[ -1 \le \sin(11x^7) \le 1. \]

The numerator isn’t just sin(11x^7), though: we need to multiply by x and 
then subtract 1/2. We can in fact multiply by x and then subtract 1/2 from 
all three “sides” of the above inequality to get 

\[ -x - \frac{1}{2} \le x \sin(11x^7) - \frac{1}{2} \le x - \frac{1}{2} \]

for any x > 0. (If instead x < 0, which would be the case if the limit were as 
x → −∞, then multiplying by the negative number x would just mean that 
you’d have to flip those less-than-or-equal signs around to become 
greater-than-or-equal signs. Otherwise the solution would be identical.) Anyway, 
that takes care of the numerator. We still need to divide by the denominator. 
Since 2x^4 > 0, we can divide the above inequality by 2x^4 to get 

\[ \frac{-x - \frac{1}{2}}{2x^4} \le \frac{x \sin(11x^7) - \frac{1}{2}}{2x^4} \le \frac{x - \frac{1}{2}}{2x^4}. \]

This is all we need. I leave it to you to use the methods of Section 4.3 of 
Chapter 4 to show that the limits of the outside terms are both 0 as x → ∞, 
that is, 

\[ \lim_{x \to \infty} \frac{-x - \frac{1}{2}}{2x^4} = 0 \quad \text{and} \quad \lim_{x \to \infty} \frac{x - \frac{1}{2}}{2x^4} = 0. \]

(Don’t be lazy! These are pretty easy limits, but you should try to justify 
them now.) Now we invoke the sandwich principle; since our original function 
is trapped between two functions which tend to 0 as x → ∞, it also tends to 
0 then. That is, 

\[ \lim_{x \to \infty} \frac{x \sin(11x^7) - \frac{1}{2}}{2x^4} = 0. \]
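You can watch the sandwich close numerically by printing the lower bound, the middle quantity and the upper bound side by side for a few positive values of x. This is a sketch in Python with names chosen just for the illustration. (For large x the computed value of sin(11x^7) is not very meaningful digit for digit, but it still lies between −1 and 1, which is all the argument uses.)

```python
import math

def lower(x):
    # (-x - 1/2) / (2x^4), the bottom slice of the sandwich
    return (-x - 0.5) / (2 * x**4)

def middle(x):
    # (x sin(11x^7) - 1/2) / (2x^4), the quantity we care about
    return (x * math.sin(11 * x**7) - 0.5) / (2 * x**4)

def upper(x):
    # (x - 1/2) / (2x^4), the top slice of the sandwich
    return (x - 0.5) / (2 * x**4)

for x in (2.0, 10.0, 100.0):
    print(x, lower(x), middle(x), upper(x))
```

Both outer columns shrink toward 0 as x grows, and the middle column has nowhere else to go.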


Another consequence of the inequality −1 ≤ sin(x) ≤ 1 (and the similar 
one for cos(x)) is that you can treat sin(anything) or cos(anything) as being 
of lower degree than any positive power of x, so long as you are only adding 
or subtracting. More precisely, if you are solving a problem of the form 

\[ \lim_{x \to \infty} \frac{p(x)}{q(x)}, \]

where p and q are polynomials or poly-type functions but with some sines 
and cosines added on, then the degrees of the top and bottom are the same 
as they would be without the sines and cosines added on. The only exception 
is when p or q has degree 0; then the trig part could be significant. 

Here’s an example of how adding sines and cosines doesn’t make much of 
a difference: what is 

\[ \lim_{x \to \infty} \frac{3x^2 + 2x + 5 + \sin(3000x^9)}{2x^2 - 1 - \cos(22x)}? \]

In the numerator, the dominant term is still 3x^2, since the sin(3000x^9) term 
is only between −1 and 1 and is insignificant in comparison. Compare this 
to the previous example, where we multiplied the highest-degree term x by 
sin(11x^7); there the sine factor matters. In our current example, the sine term 
is added instead. 

How about the denominator? Well, the cosine term is much smaller than 
the dominant term 2x^2. All up, we’ll multiply and divide the numerator by 
3x^2 and the denominator by 2x^2: 

\[ \lim_{x \to \infty} \frac{3x^2 + 2x + 5 + \sin(3000x^9)}{2x^2 - 1 - \cos(22x)} = \lim_{x \to \infty} \frac{\frac{3x^2 + 2x + 5 + \sin(3000x^9)}{3x^2} \times (3x^2)}{\frac{2x^2 - 1 - \cos(22x)}{2x^2} \times (2x^2)} \]

\[ = \lim_{x \to \infty} \frac{1 + \frac{2}{3x} + \frac{5}{3x^2} + \frac{\sin(3000x^9)}{3x^2}}{1 - \frac{1}{2x^2} - \frac{\cos(22x)}{2x^2}} \times \frac{3x^2}{2x^2}. \]


Now what happens? We certainly know that 2/(3x), 5/(3x^2), and 1/(2x^2) go to 0 
in the limit, but how about the sin(3000x^9)/(3x^2) and cos(22x)/(2x^2) terms? If 
you want to give a complete solution, you need to use the sandwich principle 
(once for each term) to show that they both go to 0. I suggest you try it as 
an exercise now. In practice, most mathematicians would automatically write 
down the answer 0, having established the general principle that 

\[ \lim_{x \to \infty} \frac{\sin(\text{anything})}{x^{\alpha}} = 0 \]

for any positive exponent α, and similarly when sine is replaced by cosine. In 
any case, the above limit works out to be 

\[ \frac{1 + 0 + 0 + 0}{1 - 0 - 0} \times \frac{3}{2} = \frac{3}{2}. \]
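To see that the added sine and cosine really are insignificant, you can evaluate the whole fraction at some large values of x. Here is a minimal sketch in Python; `ratio` is a name made up for this example. (The sine of an enormous number like 3000x^9 can’t be computed meaningfully in floating point, but the result is still trapped between −1 and 1, and that’s the only property the argument relies on.)

```python
import math

def ratio(x):
    # (3x^2 + 2x + 5 + sin(3000x^9)) / (2x^2 - 1 - cos(22x))
    top = 3 * x**2 + 2 * x + 5 + math.sin(3000 * x**9)
    bottom = 2 * x**2 - 1 - math.cos(22 * x)
    return top / bottom

for x in (1e1, 1e3, 1e6):
    print(x, ratio(x), 3 / 2)
```

The values drift toward 1.5, and the leftover discrepancy comes mostly from the 2x term, not from the trig terms.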

Finally, let’s return to the example 

\[ \lim_{x \to 0} x \sin\left(\frac{5}{x}\right), \]

which was mentioned at the beginning of this chapter. As we saw then, this 
does belong to the large case even though the limit is taken as x → 0, because 
5/x is a large number (positive or negative) when x is near 0. So the best we 
can do is to use the sandwich principle, combined with the fact that the sine 
of any number is between −1 and 1. In particular, we have 

\[ -1 \le \sin\left(\frac{5}{x}\right) \le 1 \]

for any x. Now the temptation is to multiply by x: 

\[ -x \le x \sin\left(\frac{5}{x}\right) \le x. \]

Unfortunately, this is only true for x > 0. For example, if x = −2, then the 
leftmost part of the inequality would be 2 and the rightmost part would be 
−2, which is crazy. So let’s worry about the right-hand limit first: 

\[ \lim_{x \to 0^+} x \sin\left(\frac{5}{x}\right). \]

Now we can use the above inequalities and note that both −x and x go to 0 
as x → 0+, so the sandwich principle applies and the above limit is 0. As for 
the left-hand limit (as x → 0−), now we start off with the same inequality for 
sin(5/x) and multiply it by x, but this time we have to reverse the inequalities 
since x is negative. In particular, when x < 0, we have 

\[ -x \ge x \sin\left(\frac{5}{x}\right) \ge x. \]

It doesn’t matter much, though: the outer quantities still go to 0 as x → 0−, 
so the middle quantity also goes to 0. Since the left-hand and right-hand 
limits are both 0, so is the two-sided limit; we have proved that 

\[ \lim_{x \to 0} x \sin\left(\frac{5}{x}\right) = 0. \]

(This example is very similar to the one on page 52.) 
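The two one-sided sandwiches can be summed up as |x sin(5/x)| ≤ |x|, and that is easy to spot-check on both sides of 0. A short sketch in Python, with `wiggle` as a purely illustrative name:

```python
import math

def wiggle(x):
    # x sin(5/x), defined for any nonzero x
    return x * math.sin(5 / x)

for x in (0.1, 0.001, 0.00001, -0.00001, -0.001, -0.1):
    # the middle column never exceeds the last column in size
    print(x, wiggle(x), abs(x))
```

The values of sin(5/x) jump around wildly as x approaches 0, but the factor of x squeezes the product toward 0 anyway.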











in the above limit, set t = x − π/2. Then when x → π/2, you can see that
t → 0. Also, x = t + π/2, so we have


Notice that we still need to know the behavior of cosine near π/2 (as you
can see by setting t near 0 and looking at what you’re taking cosine of!); the
substitution hasn’t changed that fact. Now, this is where you need to know
the following trig identity from Section 2.4 of Chapter 2:

$$\cos\left(\frac{\pi}{2} - x\right) = \sin(x).$$

In our limit, we have cos(π/2 + t), so we need to apply the above trig identity
with x replaced by −t in order for it to be useful. We get

$$\cos\left(\frac{\pi}{2} + t\right) = \sin(-t).$$

The other thing we need to remember is that sine is an odd function, so

$$\cos\left(\frac{\pi}{2} + t\right) = \sin(-t) = -\sin(t).$$

Now we can put this into the limit and finish the problem. All in all, 


Not so easy, but knowing the trig identities certainly helps in situations like 
these. 

7.1.5 Proof of an important limit

We’ve been using the following limit over and over again in this chapter, and 
now it’s time to prove it: 

$$\lim_{x \to 0} \frac{\sin(x)}{x} = 1.$$

The proof has to rely on the geometry of right-angled triangles, since that’s
where the sine function comes from. Let’s start with the right-hand limit (as
x → 0⁺). Once we get that, we’ll see that the two-sided limit is pretty easy.
So, we’ll start off by assuming that x is near 0 but positive. Let’s draw a 
wedge OAB of a circle of radius 1 unit with angle x: 





















Section 7.1.5: Proof of an important limit • 1 39 


the segment AC). Since |OA| = 1, we have |AC| = sin(x). Also, we have
tan(x) = |DB|/|OB|, and |OB| = 1, so |DB| = tan(x).

I want to focus attention on three objects. One is the original wedge; we
already found that the area of this is x/2 square units. Let’s also look at the
triangles △OAB and △OBD. The base of △OAB is OB, which has length 1
unit. The height is AC, which has length sin(x) units. So the area of △OAB
is half the base times the height, or sin(x)/2 square units. As for △OBD, its
base OB has length 1 unit and its height DB has length tan(x) units, so the
area of △OBD is tan(x)/2 square units. The crucial observation is that

△OAB is contained in the wedge OAB which is contained in △OBD.

This means that the area of △OAB is less than the area of the wedge OAB,
which itself is less than the area of △OBD:

area of △OAB < area of wedge OAB < area of △OBD.


We know all three of these quantities in terms of the variable x; substituting 
them in, we have 

$$\frac{\sin(x)}{2} < \frac{x}{2} < \frac{\tan(x)}{2}.$$

Multiplying this by 2, we get a really nice inequality which is worth remembering:

$$\sin(x) < x < \tan(x) \quad \text{for } 0 < x < \frac{\pi}{2}.$$

Now we can find our limit. Let’s first take reciprocals of the nice inequality.
Remember, this forces us to switch the less-than signs to greater-than signs.
Writing tan(x) = sin(x)/cos(x), the reciprocal inequality is

$$\frac{1}{\sin(x)} > \frac{1}{x} > \frac{\cos(x)}{\sin(x)}.$$

Finally, multiply by the positive quantity sin(x) to see that

$$1 > \frac{\sin(x)}{x} > \cos(x).$$

If it creeps you out to write it backward like this, you can always rewrite it as

$$\cos(x) < \frac{\sin(x)}{x} < 1.$$


(Remember, this is true for any x between 0 and π/2.) Now we use the
sandwich principle: since cos(0) = 1 and y = cos(x) is continuous, we know
that $\lim_{x \to 0^+} \cos(x) = 1$. Also, $\lim_{x \to 0^+} (1) = 1$; so the quantity sin(x)/x is squished
between cos(x) and 1, both of which tend to 1 as x → 0⁺. By the sandwich
principle,

$$\lim_{x \to 0^+} \frac{\sin(x)}{x} = 1$$

as well. So we’ve got our right-hand limit.

We still have to deal with the left-hand limit and show that

$$\lim_{x \to 0^-} \frac{\sin(x)}{x} = 1.$$

If we can do it, then we will have proved that both the left-hand and right- 
hand limits are 1, so the two-sided limit is also 1 and we’ll be done. 

To prove that the left-hand limit is 1, set t = −x. Then when x is a small
negative number, t is a small positive number. In math symbols, we can say
that as x → 0⁻, we have t → 0⁺. So the above limit can be written as

$$\lim_{t \to 0^+} \frac{\sin(-t)}{-t}.$$

Now we know that sin(−t) = −sin(t) (since sine is an odd function), so we
can simplify the limit down to

$$\lim_{t \to 0^+} \frac{-\sin(t)}{-t} = \lim_{t \to 0^+} \frac{\sin(t)}{t}.$$

We’ve already seen that this limit is 1 (well, with x instead of t, but so what?),
so we’re all done.
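
If you want to see the squeeze happening numerically, here is a small sketch (the helper name `ratio` and the sample values are illustrative choices of mine). It checks the inequality cos(x) < sin(x)/x < 1 for a few small positive x, and that negative inputs give the same ratio:

```python
import math

def ratio(x: float) -> float:
    """Return sin(x)/x for nonzero x."""
    return math.sin(x) / x

for k in range(1, 6):
    x = 10.0 ** (-k)
    # Right-hand side: cos(x) < sin(x)/x < 1 for 0 < x < pi/2.
    assert math.cos(x) < ratio(x) < 1.0
    # Left-hand side: sin(-x)/(-x) matches sin(x)/x because sine is odd.
    assert math.isclose(ratio(-x), ratio(x))
    print(f"x = {x:.0e}   cos(x) = {math.cos(x):.12f}   sin(x)/x = {ratio(x):.12f}")
```

The loop stops at x = 0.00001 on purpose: for much smaller x, floating-point rounding makes sin(x)/x indistinguishable from 1, which says something about the computer and nothing about the mathematics.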

Before we move on to differentiating trig functions, I want to consider the
graph of f(x) = sin(x)/x. The argument for the left-hand limit has in fact
shown that f is an even function (can you see why?). This means that the
y-axis acts as a mirror for the graph of y = f(x). If you look back at page 51,
you can see that we have already drawn the graph of y = f(x) when x > 3.
We didn’t do x < 3 since we didn’t know what happens. Now we know:
as x → 0, the quantity f(x) = sin(x)/x → 1. In fact, we have shown that
sin(x)/x lies between cos(x) and 1. This allows us to extend the graph down
to x > 0. Finally we use the evenness of f to give the complete graph of
y = sin(x)/x in all its glory (note the different scales on the x- and y-axes):



The graphs of the envelope functions y = 1/x and y = −1/x are shown as
dotted curves. Also, the x-intercepts are at all the multiples of π except for
0. Finally, as you can see, the function isn’t continuous at x = 0 since it isn’t
defined there. However, if we define the function g by g(x) = sin(x)/x if x ≠ 0
and g(0) = 1, then we have effectively filled in the open circle at (0, 1) in the
above picture, and the function g is continuous.
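
The filled-in function g translates directly into code. This is a minimal sketch of the definition above (the test points are arbitrary choices of mine), with spot checks of the properties just described: the value at 0, the x-intercept at π, and the envelope 1/x.

```python
import math

def g(x: float) -> float:
    """sin(x)/x, with the hole at x = 0 filled in by the value 1."""
    if x == 0.0:
        return 1.0
    return math.sin(x) / x

print(g(0.0))          # the filled-in point (0, 1)
print(g(1e-5))         # very close to 1, as continuity at 0 requires
print(g(math.pi))      # an x-intercept (zero up to rounding error)
print(abs(g(5.0)) <= 1.0 / 5.0)   # the graph stays inside the envelope
```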


7.2 Derivatives Involving Trig Functions


Now, time to differentiate some functions. Let’s start off by differentiating
sin(x) with respect to x. To do this, we’re going to use two of the limits from
Section 7.1.2 above:

$$\lim_{h \to 0} \frac{\sin(h)}{h} = 1$$

and

$$\lim_{h \to 0} \frac{1 - \cos(h)}{h} = 0.$$

(OK, so I changed x to h, but no matter: the h is a dummy variable anyway
and could be replaced by any letter at all.) Anyway, with f(x) = sin(x), let’s
differentiate:

$$f'(x) = \lim_{h \to 0} \frac{f(x + h) - f(x)}{h} = \lim_{h \to 0} \frac{\sin(x + h) - \sin(x)}{h}.$$


Now what? Well, you should remember the formula

$$\sin(A + B) = \sin(A)\cos(B) + \cos(A)\sin(B);$$

if not, you’d better look at Chapter 2 again. Anyway, we want to replace A
by x and B by h, so we have

$$\sin(x + h) = \sin(x)\cos(h) + \cos(x)\sin(h).$$

Inserting this in the above limit, we get

$$f'(x) = \lim_{h \to 0} \frac{\sin(x)\cos(h) + \cos(x)\sin(h) - \sin(x)}{h}.$$

All that’s left is to group the terms a little differently and do a bit of factoring;
we get

$$f'(x) = \lim_{h \to 0} \frac{\sin(x)(\cos(h) - 1) + \cos(x)\sin(h)}{h} = \lim_{h \to 0} \left( \sin(x)\left(\frac{\cos(h) - 1}{h}\right) + \cos(x)\left(\frac{\sin(h)}{h}\right) \right).$$

Notice that we separated as much x-stuff as we could from h-stuff. Now we
actually have to take the limit as h → 0 (not as x → 0!). Using the two limits
from the beginning of this section, we get

$$f'(x) = \sin(x) \times 0 + \cos(x) \times 1 = \cos(x).$$

That is, the derivative of f(x) = sin(x) is f′(x) = cos(x), or in other words,

$$\frac{d}{dx} \sin(x) = \cos(x).$$

Now you should try to repeat the argument but this time with f(x) = cos(x).
You just need the identity

$$\cos(A + B) = \cos(A)\cos(B) - \sin(A)\sin(B)$$

from Chapter 2. It’s a really good exercise, so try to do it now. If you’ve done
it correctly, you should see that

$$\frac{d}{dx} \cos(x) = -\sin(x).$$
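
To watch the limit definition at work, here is a rough sketch that evaluates the difference quotient for sine and cosine at a sample point as h shrinks. The point x = 1 and the helper name `difference_quotient` are assumptions made for illustration only.

```python
import math

def difference_quotient(f, x: float, h: float) -> float:
    """Return (f(x + h) - f(x)) / h, the quantity whose limit defines f'(x)."""
    return (f(x + h) - f(x)) / h

x = 1.0  # hypothetical sample point
for h in (1e-1, 1e-3, 1e-5):
    dq_sin = difference_quotient(math.sin, x, h)
    dq_cos = difference_quotient(math.cos, x, h)
    print(f"h = {h:.0e}   sin: {dq_sin:.6f} (cos(x) = {math.cos(x):.6f})   "
          f"cos: {dq_cos:.6f} (-sin(x) = {-math.sin(x):.6f})")
```

As h gets smaller, the first column should settle toward cos(1) and the second toward −sin(1), matching the two derivative formulas.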

Anyway, it’s a piece of cake to get the derivatives of the other trig functions
now; you don’t need to use any limits. You can just use the quotient rule and
the chain rule. Let’s start with the derivative of y = tan(x). We can write
tan(x) as sin(x)/cos(x), so if we set u = sin(x) and v = cos(x), then y = u/v.
We just worked out that du/dx = cos(x) and dv/dx = −sin(x). Using the
quotient rule, we get

$$\frac{dy}{dx} = \frac{v\frac{du}{dx} - u\frac{dv}{dx}}{v^2} = \frac{\cos(x)(\cos(x)) - \sin(x)(-\sin(x))}{\cos^2(x)}.$$

The numerator of this last fraction is just cos²(x) + sin²(x), which is always
equal to 1; so the derivative is just

$$\frac{dy}{dx} = \frac{1}{\cos^2(x)} = \sec^2(x).$$

We’ve just shown that

$$\frac{d}{dx} \tan(x) = \sec^2(x).$$

Now let’s calculate the derivative of y = sec(x). Here we are able to write
y = 1/cos(x), so you might think that the quotient rule is best. Indeed, you
can do it by using the quotient rule, but the chain rule is nicer. If u = cos(x),
then y = 1/u. We can differentiate both of these things: dy/du = −1/u², and
du/dx = −sin(x). By the chain rule,

$$\frac{dy}{dx} = \frac{dy}{du}\,\frac{du}{dx} = \left(-\frac{1}{u^2}\right)(-\sin(x)) = \frac{\sin(x)}{\cos^2(x)},$$

where we had to replace u by cos(x) in the last step. Actually, you can tidy
up the answer as follows:

$$\frac{\sin(x)}{\cos^2(x)} = \frac{1}{\cos(x)} \cdot \frac{\sin(x)}{\cos(x)} = \sec(x)\tan(x),$$

so we’ve shown that

$$\frac{d}{dx} \sec(x) = \sec(x)\tan(x).$$
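
A numerical spot check of these two formulas can be reassuring. The sketch below is an illustration under my own choices (a central-difference estimate, the helper names `central_difference` and `sec`, and three arbitrary sample points); it compares the estimates with sec²(x) and sec(x) tan(x).

```python
import math

def central_difference(f, x: float, h: float = 1e-6) -> float:
    """Estimate f'(x) by the symmetric quotient (f(x + h) - f(x - h)) / (2h)."""
    return (f(x + h) - f(x - h)) / (2.0 * h)

def sec(x: float) -> float:
    return 1.0 / math.cos(x)

for x in (0.3, 1.0, -0.7):
    print(f"x = {x:+.1f}   d/dx tan: {central_difference(math.tan, x):.6f} "
          f"vs sec^2: {sec(x) ** 2:.6f}   d/dx sec: {central_difference(sec, x):.6f} "
          f"vs sec*tan: {sec(x) * math.tan(x):.6f}")
```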

As for y = csc(x), that should be written as 1/sin(x). Once again, it’s
best to use the chain rule, letting u = sin(x) and writing y = 1/u. But I


4 鵠=-(扣⑷， 














7.2.1 Examples of differentiating trig functions

Now that you have some more functions to differentiate, you’d better make 
sure you still know how to use the product rule, the quotient rule, and the 
chain rule. For example, how would you find the following derivatives: 


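$$\frac{d}{dx}\left(x^2 \sin(x)\right), \quad \frac{d}{dx}\left(\frac{\sec(x)}{x^5}\right) \quad \text{and} \quad \frac{d}{dx}\left(\cot(x^3)\right)?$$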










Let’s take them one at a time. If y = x² sin(x), then we can write y = uv,
where u = x² and v = sin(x). Now we just need to set up our table:

    u = x²              v = sin(x)
    du/dx = 2x          dv/dx = cos(x)

Using the product rule (see Section 6.2.3 in the previous chapter), we get

$$\frac{dy}{dx} = v\frac{du}{dx} + u\frac{dv}{dx} = \sin(x) \cdot (2x) + x^2 \cos(x).$$

This would normally be written as 2x sin(x) + x² cos(x). Anyway, let’s do the
second example. If y = sec(x)/x⁵, this time we set u = sec(x) and v = x⁵ so
that y = u/v. Our table looks like this:

    u = sec(x)                  v = x⁵
    du/dx = sec(x) tan(x)       dv/dx = 5x⁴

Whipping out the quotient rule leads to

$$\frac{dy}{dx} = \frac{v\frac{du}{dx} - u\frac{dv}{dx}}{v^2} = \frac{x^5 \sec(x)\tan(x) - \sec(x)(5x^4)}{(x^5)^2} = \frac{\sec(x)(x\tan(x) - 5)}{x^6}.$$

Note that we canceled out a factor of x⁴ at the end. Now, moving on to the
third example, set y = cot(x³). Here we are dealing with a composition of
two functions, so we’d better use the chain rule. The first thing that happens
to x is that it gets cubed, so let u = x³. Then y = cot(u). Our table is

    y = cot(u)              u = x³
    dy/du = −csc²(u)        du/dx = 3x²

By the chain rule, we have

$$\frac{dy}{dx} = \frac{dy}{du}\,\frac{du}{dx} = -\csc^2(u)\,(3x^2).$$

We can’t just leave that u term lying around: we need to replace it by x³.
Altogether, then, our derivative is −3x² csc²(x³).
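
One way to double-check answers like these is to compare each formula with a numerical estimate of the slope. The following is a sketch only: the sample point x = 0.9, the step size, and all function names are choices I made for illustration.

```python
import math

def central_difference(f, x: float, h: float = 1e-6) -> float:
    """Estimate f'(x) by (f(x + h) - f(x - h)) / (2h)."""
    return (f(x + h) - f(x - h)) / (2.0 * h)

# Each pair is (function, derivative found by the product, quotient or chain rule).
examples = {
    "x^2 sin(x)": (lambda x: x ** 2 * math.sin(x),
                   lambda x: 2 * x * math.sin(x) + x ** 2 * math.cos(x)),
    "sec(x)/x^5": (lambda x: 1.0 / (math.cos(x) * x ** 5),
                   lambda x: (x * math.tan(x) - 5) / (math.cos(x) * x ** 6)),
    "cot(x^3)":   (lambda x: 1.0 / math.tan(x ** 3),
                   lambda x: -3 * x ** 2 / math.sin(x ** 3) ** 2),
}

x0 = 0.9  # hypothetical sample point
for name, (f, f_prime) in examples.items():
    print(f"{name:12s} numeric: {central_difference(f, x0):+.6f}   formula: {f_prime(x0):+.6f}")
```

If a formula and its numerical estimate disagree in the first few decimal places, that is usually a sign of a slip in one of the rules.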

Before we move on, I want to show you a neat trick. Suppose you have
y = sin(8x) and you want to find dy/dx. You could do it by using the chain
rule, setting u = 8x, so that y = sin(u). It’s an easy exercise (try it!) to show
that dy/dx = 8cos(8x). Of course, there’s nothing special about the number
8; it could have been anything. So the general rule is that

$$\frac{d}{dx} \sin(ax) = a\cos(ax)$$

for any constant a. Basically, if x is replaced by ax, then there is an
extra factor of a out front when you differentiate. This also works
for the other trig functions. For example, the derivative with respect to x
of tan(x) is sec²(x), so the derivative of tan(2x) is 2 sec²(2x). In the same
way, the derivative of csc(x) is −csc(x) cot(x), so the derivative of csc(19x) is
−19 csc(19x) cot(19x). This saves you the trouble of using the chain rule in
this easy case.

7.2.2 Simple harmonic motion

One place where trig functions appear naturally is in describing the motion 
of a weight on a spring bouncing up and down. It turns out that if x is the 
position of a weight on a spring at time t, taking upward as positive, then a 
possible equation for x is something like x = 3sin(4t). The numbers 3 and 4 
might change, and the “sin” might be a “cos,” but that’s the basic idea. The 
equation is reasonable — after all, cosine keeps bouncing back and forth, and 
so does the weight. This sort of motion is called simple harmonic motion. 

So, if x = 3sin(4t) is the displacement of the weight from its starting
point, what are the velocity and the acceleration of the weight at time t? All
we have to do is differentiate. We know that v = dx/dt, so we just have to
differentiate 3 sin(4t) with respect to t. We could use the chain rule, but it’s
simpler to use the observation at the end of the previous section. Indeed, to
differentiate sin(4t) with respect to t, we just observe that the derivative of
sin(t) would be cos(t), so the derivative of sin(4t) is 4cos(4t). (Don’t forget
that factor of 4 out front!) All in all, we have

$$v = \frac{d}{dt}\left(3\sin(4t)\right) = 3 \times 4\cos(4t) = 12\cos(4t).$$

Now we can repeat the exercise for acceleration, which is given by dv/dt, using
the same technique:

$$a = \frac{dv}{dt} = \frac{d}{dt}\left(12\cos(4t)\right) = -12 \times 4\sin(4t) = -48\sin(4t).$$


Notice that the acceleration (which of course is the second derivative of the
displacement) is basically the same as the displacement itself, except that
there’s a minus out front and the coefficient is different (48 instead of 3).
The minus means that the acceleration is in the opposite direction from the
displacement. In fact, we have shown that

$$a = -16x,$$

since 48 = 3 × 16. Now let’s interpret this equation by examining the motion
of the weight a little more closely.
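
Here is a short sketch that encodes the three formulas for this particular weight and checks them against each other. The sample times, the step size, and the function names are illustrative assumptions of mine.

```python
import math

def position(t: float) -> float:
    return 3.0 * math.sin(4.0 * t)

def velocity(t: float) -> float:
    return 12.0 * math.cos(4.0 * t)

def acceleration(t: float) -> float:
    return -48.0 * math.sin(4.0 * t)

def central_difference(f, t: float, h: float = 1e-6) -> float:
    """Estimate the derivative of f at t by a symmetric difference quotient."""
    return (f(t + h) - f(t - h)) / (2.0 * h)

for t in (0.0, 0.2, 0.5, 1.3):
    # The relation a = -16x should hold at every time t.
    assert math.isclose(acceleration(t), -16.0 * position(t), abs_tol=1e-12)
    print(f"t = {t:.1f}   x = {position(t):+.4f}   v = {velocity(t):+.4f}   "
          f"a = {acceleration(t):+.4f}   -16x = {-16.0 * position(t):+.4f}")
```

The `central_difference` helper is there so you can confirm for yourself that `velocity` really is the slope of `position`, and `acceleration` the slope of `velocity`.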

The position x is given by x = 3sin(4t), with the understanding that the
rest position of the weight is at x = 0. Now, if we multiply the inequality
−1 ≤ sin(4t) ≤ 1 (which is true for all t) by 3, we get −3 ≤ 3 sin(4t) ≤ 3. That
is, −3 ≤ x ≤ 3. So we can see that x is oscillating between −3 and 3. When
x is positive, the weight is above its rest position; then a is negative, which 
is good: the acceleration is downward, as it should be. As x gets bigger and 
bigger, the spring compresses even more, causing the weight to experience a 
greater force and acceleration downward. Eventually the weight starts going 
down, and after a little while x becomes negative. Then the weight is below 
its rest position, so the spring is expanded and tries to pull the weight back 
up. Indeed, when x is negative, a is positive, so the force is upward. The 
following picture shows what’s going on: 

[Figure: the weight on the spring at five successive moments, labeled
x = 0, a = 0; x > 0, a < 0; x = 0, a = 0; x < 0, a > 0; x = 0, a = 0.]

When the weight is at the top of its motion, the velocity is 0. Since we have
v = 12cos(4t), this occurs whenever 4t is an odd multiple of π/2, that
is, when t = (2n + 1)π/8 for some integer n. Now, enough about simple
harmonic motion; let’s just look at one more example of trig differentiation
before moving on to implicit differentiation in the next chapter.

7.2.3 A curious function

Consider the function f given by

$$f(x) = x^2 \sin\left(\frac{1}{x}\right).$$

What is its derivative? We’d better not worry about x = 0, since f isn’t
defined there, but we’ll be fine for other values of x. Set y = f(x); then y
is the product of u = x² and v = sin(1/x). It’s easy to differentiate u with
respect to x (the answer is just 2x), but v is a little harder. The best bet is
to set w = 1/x, so that v = sin(w). Then we can draw up our standard table:

    w = 1/x                 v = sin(w)
    dw/dx = −1/x²           dv/dw = cos(w)

Now we can use the chain rule:

$$\frac{dv}{dx} = \frac{dv}{dw}\,\frac{dw}{dx} = \cos(w)\left(-\frac{1}{x^2}\right) = -\frac{\cos(1/x)}{x^2}.$$

Now that we have du/dx and dv/dx, we can finally use the product rule on
y = uv:

$$\frac{dy}{dx} = v\frac{du}{dx} + u\frac{dv}{dx} = \sin\left(\frac{1}{x}\right)(2x) + x^2\left(-\frac{\cos(1/x)}{x^2}\right) = 2x\sin\left(\frac{1}{x}\right) - \cos\left(\frac{1}{x}\right),$$

and we’re done.

It turns out that the function f is pretty curious. Let’s see why. (If you
don’t feel like it, I guess you can go on to the next chapter and come back to
it later.) Anyway, to investigate further, we’ll need the following three limits:

$$\lim_{x \to 0} x^2 \sin\left(\frac{1}{x}\right) = 0, \quad \lim_{x \to 0} x \sin\left(\frac{1}{x}\right) = 0, \quad \text{and} \quad \lim_{x \to 0^+} \cos\left(\frac{1}{x}\right) \ \text{DNE}.$$

You can do the first two of these limits using the sandwich principle and the
fact that sine or cosine of anything (even 1/x) is between −1 and 1. The third
limit is a little trickier, but we did it for sin(1/x) in Section 3.3 of Chapter 3,
and changing sin to cos doesn’t make any difference. The issue (as you may
recall) is that the oscillations of cos(1/x) between −1 and 1 become more and
more wild as x → 0⁺, so the limit doesn’t exist.

Anyway, the first limit says that $\lim_{x \to 0} f(x) = 0$, even though f(0) is undefined.
This means that we can extend f to be continuous by filling in the
point f(0) = 0. So we’ll throw away the old f from above and define a new
one by the following formula:

$$f(x) = \begin{cases} x^2 \sin\left(\dfrac{1}{x}\right) & \text{if } x \ne 0, \\ 0 & \text{if } x = 0. \end{cases}$$

We have just shown that this new, improved f is continuous everywhere. We
have already found its derivative when x ≠ 0:

$$f'(x) = 2x\sin\left(\frac{1}{x}\right) - \cos\left(\frac{1}{x}\right).$$


So, what’s the derivative of f at x = 0? None of our fancy-shmancy rules will
help here: we have to use the formula for the derivative:

$$f'(0) = \lim_{h \to 0} \frac{f(0 + h) - f(0)}{h} = \lim_{h \to 0} \frac{h^2 \sin(1/h) - 0}{h} = \lim_{h \to 0} h \sin\left(\frac{1}{h}\right).$$

Now this last limit is the middle of our three limits from above (with h replacing
x), and it exists and equals 0. This means that f is actually differentiable
at x = 0, and in fact f′(0) = 0. Can you tell that from the graph of y = f(x)?
Here’s what it looks like for −0.1 < x < 0.1, along with the envelope functions
y = x² and y = −x²:

[Figure: graph of y = x² sin(1/x) for −0.1 < x < 0.1, with the envelope curves y = x² and y = −x².]

It looks pretty wobbly at x = 0 to me, so it’s not clear at all that the derivative
should even exist there, but we’ve just shown that it does! This leads to the
following question: what is

$$\lim_{x \to 0^+} f'(x)?$$

Since we know that f′(0) = 0, you might think that the above limit is just 0.
Let’s check it out by using the above formula for f′(x) when x ≠ 0:

$$\lim_{x \to 0^+} f'(x) = \lim_{x \to 0^+} \left(2x\sin\left(\frac{1}{x}\right) - \cos\left(\frac{1}{x}\right)\right).$$

We have two terms to deal with here. The first term (2x sin(1/x)) goes to 0
in the limit, since it’s just twice the middle of our three limits from above.
On the other hand, the second term (cos(1/x)) has no limit as x → 0⁺; this is
exactly what the third limit from above says. The conclusion is that
$\lim_{x \to 0^+} f'(x)$ doesn’t exist. By symmetry (check that f is an odd function),
neither does $\lim_{x \to 0^-} f'(x)$.

Now let’s summarize what we’ve found. Our function f is continuous
everywhere and also differentiable everywhere, even at x = 0. Indeed, at
x = 0, the derivative f′(0) equals 0, but near 0, the derivative f′(x) oscillates
wildly: $\lim_{x \to 0} f'(x)$ doesn’t exist even though f′(0) does. In particular, we have
now shown that the derivative function f′ is not itself a continuous function.
So, there are functions out there which are differentiable, yet their derivatives
aren’t continuous. That’s pretty darned curious!
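
To make the curious behavior concrete, here is a sketch that evaluates both the difference quotient at 0 and the derivative formula near 0. The specific sample points (chosen so that cos(1/x) is exactly ±1, up to rounding) are my own, picked for illustration.

```python
import math

def f(x: float) -> float:
    """x^2 sin(1/x), extended by f(0) = 0."""
    if x == 0.0:
        return 0.0
    return x * x * math.sin(1.0 / x)

def f_prime(x: float) -> float:
    """2x sin(1/x) - cos(1/x) away from 0, and 0 at x = 0."""
    if x == 0.0:
        return 0.0
    return 2.0 * x * math.sin(1.0 / x) - math.cos(1.0 / x)

# The difference quotient at 0 is h sin(1/h), which shrinks along with h.
for h in (1e-2, 1e-4, 1e-6):
    print(f"h = {h:.0e}   (f(h) - f(0))/h = {(f(h) - f(0.0)) / h:+.3e}")

# Yet f' itself keeps returning to values near -1 and +1 arbitrarily close to 0.
for n in (10, 1000, 100000):
    x_low = 1.0 / (2 * n * math.pi)         # here cos(1/x) = 1, so f'(x) is about -1
    x_high = 1.0 / ((2 * n + 1) * math.pi)  # here cos(1/x) = -1, so f'(x) is about +1
    print(f"n = {n:6d}   f'({x_low:.2e}) = {f_prime(x_low):+.6f}   "
          f"f'({x_high:.2e}) = {f_prime(x_high):+.6f}")
```

The first loop illustrates f′(0) = 0; the second shows why f′(x) cannot settle on any single value as x → 0⁺.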






CHAPTER 8 


Implicit Differentiation and Related Rates 


Let’s take a break from trying to work out how to differentiate everything in 
sight. It’s time to look at implicit differentiation, which is a nice generalization 
of regular differentiation. We’ll then see how to use this technique to solve 
word problems involving changing quantities. Knowing how fast one quantity 
is changing allows us to find how fast a different, but related, quantity is 
changing too. Anyway, the summary for this chapter is the same as the title: 

• implicit differentiation; and 

• related rates. 


8.1 Implicit Differentiation


Consider the following two derivatives:

$$\frac{d}{dx}(x^2) \quad \text{and} \quad \frac{d}{dx}(y^2).$$


The first is just 2x, as we’ve seen. So isn’t the second one 2y? That would be 
the answer if the differentiation were with respect to y, but it isn’t: the dx in 
the denominator tells us that the differentiation is with respect to x. How do 
we unravel this? 

The best way is to say to yourself that the first of the derivatives above is
asking how much the quantity x² changes when we change x a little bit. As
we saw in Section 5.2.7 of Chapter 5, if we do change x by a little bit, then
x² changes by approximately 2x times as much.

On the other hand, if you change x by a little bit, what does that do to
y²? This is what we need to know in order to find the second of our above
derivatives, d(y²)/dx. Think of it this way: if you change x, then y will change
a little bit; this change in y will cause y² to change. (All this is true only if y
depends on x, of course; if not, then when you change x, nothing at all will
happen to y.)

If you think that it sounds as if I’m hinting at the chain rule here, you’re
quite right. Here’s how it actually works. Let u = y², so that du/dy = 2y.
By the chain rule,

$$\frac{d}{dx}(y^2) = \frac{du}{dx} = \frac{du}{dy}\,\frac{dy}{dx} = 2y\frac{dy}{dx}.$$

So if you change x by a little bit, then y² changes by 2y(dy/dx) times as
much. Now you might complain that the answer contains dy/dx in it, but
what did you expect? If you want to know how the quantity y² changes when
you change x a little, then first you need to know something about how y
changes! (Again, if y doesn’t depend on x, then dy/dx equals 0 for all x, so
d(y²)/dx is also 0 for all x. That is, y² doesn’t depend on x either.)
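
To see the extra dy/dx factor in action, here is a small sketch. It assumes, purely for illustration, that y happens to be sin(x); nothing in the discussion above depends on that choice. It compares a numerical slope of y² with the chain-rule answer 2y(dy/dx) and with the tempting but wrong answer 2y.

```python
import math

def y(x: float) -> float:
    return math.sin(x)          # hypothetical choice of y as a function of x

def dy_dx(x: float) -> float:
    return math.cos(x)

def y_squared(x: float) -> float:
    return y(x) ** 2

def central_difference(f, x: float, h: float = 1e-6) -> float:
    """Estimate f'(x) by (f(x + h) - f(x - h)) / (2h)."""
    return (f(x + h) - f(x - h)) / (2.0 * h)

x0 = 0.8
numeric = central_difference(y_squared, x0)
chain_rule = 2.0 * y(x0) * dy_dx(x0)   # 2y (dy/dx)
naive = 2.0 * y(x0)                    # forgets that y depends on x

print(f"numeric slope of y^2: {numeric:.6f}")
print(f"2y dy/dx:             {chain_rule:.6f}")
print(f"2y (wrong here):      {naive:.6f}")
```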



8.1.1 Techniques and examples

Now it’s time to get practical. Consider the following equation: 

$$x^2 + y^2 = 4.$$

The quantity y isn’t a function of x. In fact, when −2 < x < 2, there are two
values of y satisfying this equation. On the other hand, the graph of the above
relation is the circle of radius 2 units centered at the origin. This circle has
nice tangents everywhere, and we should be able to find their slopes without
having to write y = ±√(4 − x²) and differentiating. In fact, all we have to do
is whack a d/dx in front of both sides:

$$\frac{d}{dx}(x^2 + y^2) = \frac{d}{dx}(4).$$

As we know, the left-hand side can be split into two pieces without any problem.
In fact, normally one would just automatically start by writing

$$\frac{d}{dx}(x^2) + \frac{d}{dx}(y^2) = \frac{d}{dx}(4).$$

To simplify this, note that we have already identified the two quantities on the
left-hand side in the previous section, and the right-hand side is 0 because 4
is constant. Be careful not to write 4 instead: this is a very common mistake!
Anyway, here’s what we get:

$$2x + 2y\frac{dy}{dx} = 0.$$

Dividing by 2 and rearranging leads to

$$\frac{dy}{dx} = -\frac{x}{y}.$$

This formula says that at the point (x, y) on the circle, the slope of the tangent
is −x/y. If the point isn’t on the circle, then the formula doesn’t say anything
(at least as far as we’re concerned). Now, let’s use the formula to find the
equation of the tangent to the circle at the point (1, √3). This point certainly
does lie on the circle, since x² + y² = 1² + (√3)² = 4. By the above formula,
the slope is given by dy/dx = −1/√3. So, the tangent line has slope −1/√3
and goes through (1, √3). Using the point-slope formula, we see that the
equation of the line is

$$y - \sqrt{3} = -\frac{1}{\sqrt{3}}(x - 1).$$

This can be simplified slightly to y = (4 − x)/√3, if you like.
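
The circle is one of the rare cases where we can also solve for y, so it makes a good test of the implicit formula. The sketch below (function names and step size are my own choices) compares −x/y at (1, √3) with a numerical slope of the upper semicircle y = √(4 − x²), and checks the simplified tangent line.

```python
import math

def slope_on_circle(x: float, y: float) -> float:
    """Slope of the tangent at a point (x, y) on x^2 + y^2 = 4, from dy/dx = -x/y."""
    return -x / y

def upper_semicircle(x: float) -> float:
    return math.sqrt(4.0 - x * x)

x0, y0 = 1.0, math.sqrt(3.0)
m = slope_on_circle(x0, y0)

def tangent_line(x: float) -> float:
    """Point-slope form: y - y0 = m (x - x0)."""
    return y0 + m * (x - x0)

h = 1e-6
numeric_slope = (upper_semicircle(x0 + h) - upper_semicircle(x0 - h)) / (2.0 * h)

print(f"implicit slope: {m:.6f}   numeric slope: {numeric_slope:.6f}")
print(f"tangent at x = 2.5: {tangent_line(2.5):.6f}   (4 - x)/sqrt(3): {(4 - 2.5) / math.sqrt(3.0):.6f}")
```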

Here’s another example: if

$$5\sin(x) + 3\sec(y) = y - x^2 + 3,$$

what is the equation of the tangent at the origin? Unlike the previous example,
it’s impossible to solve this equation for y (or x). So we have to use implicit
differentiation. Let’s first verify that the origin actually lies on the curve.
Plugging in x = 0 and y = 0 gives 5 sin(0) + 3 sec(0) on the left-hand side,
which is just 3 (remember that sec(0) = 1/cos(0) = 1). The right-hand side is
also 3, so the origin is on the curve. Now let’s differentiate the above equation,
splitting it up as we do so:

$$\frac{d}{dx}(5\sin(x)) + \frac{d}{dx}(3\sec(y)) = \frac{dy}{dx} - \frac{d}{dx}(x^2) + \frac{d}{dx}(3).$$

The only one of these quantities that’s hard to simplify is the second one
on the left-hand side. It’s not too bad, though: let u = 3sec(y). Then
du/dy = 3 sec(y) tan(y), so by the chain rule, we have

$$\frac{d}{dx}(3\sec(y)) = \frac{du}{dx} = \frac{du}{dy}\,\frac{dy}{dx} = 3\sec(y)\tan(y)\frac{dy}{dx}.$$

So we can return to the previous equation and differentiate everything, getting

$$5\cos(x) + 3\sec(y)\tan(y)\frac{dy}{dx} = \frac{dy}{dx} - 2x.$$

Note that when you differentiate the constant 3, you get 0. In any case, we
could solve for dy/dx here: just throw all the stuff involving dy/dx on one
side and everything else on the other side:

$$\frac{dy}{dx} - 3\sec(y)\tan(y)\frac{dy}{dx} = 2x + 5\cos(x).$$

Now factor:

$$\frac{dy}{dx}\left(1 - 3\sec(y)\tan(y)\right) = 2x + 5\cos(x),$$

and then divide to get

$$\frac{dy}{dx} = \frac{2x + 5\cos(x)}{1 - 3\sec(y)\tan(y)}.$$

Finally, plug in x = 0 and y = 0 to see that

$$\frac{dy}{dx} = \frac{2(0) + 5\cos(0)}{1 - 3\sec(0)\tan(0)} = \frac{2(0) + 5(1)}{1 - 3(1)(0)} = 5.$$

Since the tangent line has slope 5 and goes through the origin, its equation is
just y = 5x, and we’re done. But do you see how we might have saved a little
effort? Go back to the equation

$$5\cos(x) + 3\sec(y)\tan(y)\frac{dy}{dx} = \frac{dy}{dx} - 2x$$

from above. We manipulated this to find the general expression for dy/dx,
but actually we only care about what happens at the origin. So we could have
saved a little time by plugging x = 0 and y = 0 into the above equation. We
would have gotten

$$5\cos(0) + 3\sec(0)\tan(0)\frac{dy}{dx} = \frac{dy}{dx} - 2(0).$$

This easily reduces to dy/dx = 5. So a good rule of thumb is that if you
only need the derivative at a certain point, substitute before rearranging;
it often saves time.
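
Since this curve can’t be solved for y by hand, a computer check is a nice way to gain confidence in the slope 5. The sketch below is not a method from the text: it uses Newton’s method to find the y-value on the curve for an x slightly to either side of 0, then estimates the slope from those two points. The step size, iteration count, and function names are all assumptions of mine.

```python
import math

def F(x: float, y: float) -> float:
    """Left side minus right side of 5 sin(x) + 3 sec(y) = y - x^2 + 3."""
    return 5.0 * math.sin(x) + 3.0 / math.cos(y) - y + x * x - 3.0

def dF_dy(x: float, y: float) -> float:
    """Partial derivative of F in y: 3 sec(y) tan(y) - 1."""
    return 3.0 * math.sin(y) / math.cos(y) ** 2 - 1.0

def solve_y(x: float, y_start: float = 0.0, steps: int = 20) -> float:
    """Newton's method in y for fixed x, starting near the origin."""
    y = y_start
    for _ in range(steps):
        y -= F(x, y) / dF_dy(x, y)
    return y

def dy_dx(x: float, y: float) -> float:
    """The formula obtained by implicit differentiation."""
    return (2.0 * x + 5.0 * math.cos(x)) / (1.0 - 3.0 * math.tan(y) / math.cos(y))

h = 1e-4
slope_estimate = (solve_y(h) - solve_y(-h)) / (2.0 * h)

print(f"formula at the origin: {dy_dx(0.0, 0.0):.6f}")
print(f"numerical estimate:    {slope_estimate:.6f}")
```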

So far, we’ve only used the chain rule. Sometimes you might need to use
the product rule or the quotient rule. For example, if

$$y\cot(x) = 3\csc(y) + x^7,$$

then you’ll need the product rule and the chain rule to find dy/dx. Indeed, if
we differentiate, we get

$$\frac{d}{dx}\left(y\cot(x)\right) = \frac{d}{dx}\left(3\csc(y)\right) + \frac{d}{dx}\left(x^7\right).$$

The left-hand side is the product of y and cot(x). We should give it a name;
I’ll call it s, so that s = y cot(x). If we also set v = cot(x), then s = yv, and
we can use the product rule to differentiate s with respect to x:

$$\frac{ds}{dx} = v\frac{dy}{dx} + y\frac{dv}{dx} = \cot(x)\frac{dy}{dx} + y\left(-\csc^2(x)\right).$$

(Remember that the derivative of cot(x) with respect to x is −csc²(x).) Now,
let’s worry about the right-hand side of our original equation from above. For
the first term, 3csc(y), we’ll use the chain rule. Let’s call the term u, so
u = 3csc(y). We can see that du/dy = −3csc(y) cot(y), so by the chain rule
we have

$$\frac{du}{dx} = \frac{du}{dy}\,\frac{dy}{dx} = -3\csc(y)\cot(y)\frac{dy}{dx}.$$

Finally, the derivative of the last term, x⁷, with respect to x is just 7x⁶.
Putting it all together, we see that when we differentiate both sides of our
original equation

$$y\cot(x) = 3\csc(y) + x^7$$

with respect to x, we get

$$\cot(x)\frac{dy}{dx} - y\csc^2(x) = -3\csc(y)\cot(y)\frac{dy}{dx} + 7x^6.$$

Let’s throw everything involving dy/dx on the left and everything else on the
right:

$$\cot(x)\frac{dy}{dx} + 3\csc(y)\cot(y)\frac{dy}{dx} = y\csc^2(x) + 7x^6.$$

Now factor the left-hand side and divide to solve for dy/dx:

$$\frac{dy}{dx} = \frac{y\csc^2(x) + 7x^6}{\cot(x) + 3\csc(y)\cot(y)},$$

and we’re done.
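
Here is one way to check the algebra by machine. The sketch relies on a fact that the text has not stated: if all terms are moved to one side to get F(x, y) = 0, then implicit differentiation gives dy/dx = −F_x/F_y, where F_x and F_y are the derivatives of F in x alone and in y alone. The code estimates those two numerically and compares the result with the formula above. The sample point is arbitrary and is not claimed to lie on the curve; the comparison only tests that the two expressions agree.

```python
import math

def F(x: float, y: float) -> float:
    """y cot(x) - 3 csc(y) - x^7; the curve in the text is F(x, y) = 0."""
    return y / math.tan(x) - 3.0 / math.sin(y) - x ** 7

def dy_dx_formula(x: float, y: float) -> float:
    """(y csc^2(x) + 7x^6) / (cot(x) + 3 csc(y) cot(y)), as derived above."""
    csc_x, cot_x = 1.0 / math.sin(x), 1.0 / math.tan(x)
    csc_y, cot_y = 1.0 / math.sin(y), 1.0 / math.tan(y)
    return (y * csc_x ** 2 + 7.0 * x ** 6) / (cot_x + 3.0 * csc_y * cot_y)

def dy_dx_from_partials(x: float, y: float, h: float = 1e-6) -> float:
    """-F_x / F_y with both partial derivatives estimated by central differences."""
    F_x = (F(x + h, y) - F(x - h, y)) / (2.0 * h)
    F_y = (F(x, y + h) - F(x, y - h)) / (2.0 * h)
    return -F_x / F_y

x0, y0 = 1.0, 0.8  # hypothetical sample values
print(f"formula:  {dy_dx_formula(x0, y0):.6f}")
print(f"numeric:  {dy_dx_from_partials(x0, y0):.6f}")
```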

Finally, consider the equation

$$x - y\cos\left(\frac{y}{x^4}\right) = \pi + 1.$$

What is the equation of the tangent at the point (1, π) on the curve? I leave
it to you to substitute x = 1 and y = π, and make sure that the left- and
right-hand sides agree, so that the point is indeed on the curve. Now we have
to differentiate. We get

$$\frac{d}{dx}(x) - \frac{d}{dx}\left(y\cos\left(\frac{y}{x^4}\right)\right) = \frac{d}{dx}(\pi + 1).$$

The first term is easy: it’s just 1. Also, the right-hand side is 0, since π + 1
is constant. This leaves us with an awful mess in the middle. Suppose we set

$$s = y\cos\left(\frac{y}{x^4}\right).$$

Then s is the product of y and v, where v = cos(y/x⁴). By the product rule,
we have

$$\frac{ds}{dx} = v\frac{dy}{dx} + y\frac{dv}{dx}.$$

There’s no escape: we are going to have to differentiate v. Suppose we set
t = y/x⁴. Then v = cos(t), so dv/dt = −sin(t), and the chain rule tells us
that

$$\frac{dv}{dx} = \frac{dv}{dt}\,\frac{dt}{dx} = -\sin(t)\frac{dt}{dx} = -\sin\left(\frac{y}{x^4}\right)\frac{dt}{dx}.$$

We’re not out of the woods yet, though: we need to find dt/dx. Now t = y/x⁴,
so set U = y and V = x⁴. (I already used a little v, so I’ll use the capital
letter here.) The quotient rule says that

$$\frac{dt}{dx} = \frac{V\frac{dU}{dx} - U\frac{dV}{dx}}{V^2} = \frac{x^4\frac{dy}{dx} - y\frac{d}{dx}(x^4)}{(x^4)^2} = \frac{x^4\frac{dy}{dx} - 4x^3 y}{x^8} = \frac{x\frac{dy}{dx} - 4y}{x^5}.$$

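Now we just need to unwind everything. Working backward, we can now
finish the calculation of dv/dx:

$$\frac{dv}{dx} = -\sin\left(\frac{y}{x^4}\right)\frac{dt}{dx} = -\sin\left(\frac{y}{x^4}\right)\left(\frac{x\frac{dy}{dx} - 4y}{x^5}\right).$$

This in turn allows us to find ds/dx:

$$\frac{ds}{dx} = \cos\left(\frac{y}{x^4}\right)\frac{dy}{dx} - y\sin\left(\frac{y}{x^4}\right)\left(\frac{x\frac{dy}{dx} - 4y}{x^5}\right).$$

Finally, we can go back to our original differentiated equation

$$\frac{d}{dx}(x) - \frac{d}{dx}\left(y\cos\left(\frac{y}{x^4}\right)\right) = \frac{d}{dx}(\pi + 1)$$

from above and simplify this down to

$$1 - \cos\left(\frac{y}{x^4}\right)\frac{dy}{dx} + y\sin\left(\frac{y}{x^4}\right)\left(\frac{x\frac{dy}{dx} - 4y}{x^5}\right) = 0.$$

Don’t bother solving for dy/dx! We only need to know what happens when
x = 1 and y = π. So plug those in. Noting that cos(π) = −1 and sin(π) = 0,
you should check that the whole darn thing simplifies to

$$1 - (-1)\frac{dy}{dx} + \pi \times 0 \times \text{irrelevant junk} = 0,$$

or just dy/dx = −1. Our tangent line therefore has slope −1 and goes through
(1, π), so its equation is y − π = −(x − 1); you can rewrite this as y = −x + π + 1
if you like.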
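
The advice to substitute first and solve second can be mimicked in code. In the sketch below, the function `differentiated_equation` writes out the equation that results from the steps above, namely 1 − cos(y/x⁴)·m + y sin(y/x⁴)(xm − 4y)/x⁵ = 0, with m standing in for dy/dx (the name m and the two-evaluation trick for solving a linear equation are my own devices, not the book’s).

```python
import math

def differentiated_equation(x: float, y: float, m: float) -> float:
    """Left side of the differentiated equation, with m standing in for dy/dx."""
    t = y / x ** 4
    return 1.0 - math.cos(t) * m + y * math.sin(t) * (x * m - 4.0 * y) / x ** 5

def solve_for_slope(x: float, y: float) -> float:
    """The equation is linear in m, so two evaluations pin down its root."""
    r0 = differentiated_equation(x, y, 0.0)
    r1 = differentiated_equation(x, y, 1.0)
    return -r0 / (r1 - r0)

x0, y0 = 1.0, math.pi
slope = solve_for_slope(x0, y0)

def tangent_line(x: float) -> float:
    """Point-slope form through (1, pi)."""
    return y0 + slope * (x - x0)

print(f"slope at (1, pi): {slope:.6f}")
print(f"tangent at x = 0: {tangent_line(0.0):.6f}   pi + 1 = {math.pi + 1.0:.6f}")
```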

We still have to look at how to find the second derivative using implicit 
differentiation. Just before we do that, here’s a brief summary of the above 
methods: 

• in your original equation, differentiate everything and simplify using the 
chain, product, and quotient rules; 

• if you want to find dy/dx, rearrange and divide to solve for dy/dx; but 

• if instead you want to find the slope or equation of the tangent at a 
particular point on the curve, first substitute the known values of x and 
y, then rearrange to find dy/dx. Then use the point-slope formula to 
find the equation of the tangent, if needed. 


8.1.2 Finding the second derivative implicitly



It’s also possible to differentiate twice to get the second derivative. For example,
if

$$2y + \sin(y) = \frac{x^2}{\pi} + 1,$$

then what is the value of d²y/dx² at the point (π, π/2) on the curve? Once
again, you should verify that the point does lie on the curve by plugging in
these values of x and y and seeing that the equation checks out. Now, if you
want to differentiate twice, you have to start by differentiating once! You
should get

$$2\frac{dy}{dx} + \cos(y)\frac{dy}{dx} = \frac{2x}{\pi},$$

having used the chain rule to tackle the sin(y) term. Now we need to differentiate
again. Do not substitute first! In order to differentiate, we need to see
what happens when x and y are varying. This can’t happen if we fix them
at certain values (like π and π/2). Instead, differentiate the above equation
with respect to x:

$$\frac{d}{dx}\left(2\frac{dy}{dx}\right) + \frac{d}{dx}\left(\cos(y)\frac{dy}{dx}\right) = \frac{d}{dx}\left(\frac{2x}{\pi}\right).$$

The right-hand side is just 2/π, and the first term on the left-hand side is just
2(d²y/dx²). The tricky bit is the second term on the left. We’ll need to use
the product rule: set s = cos(y)(dy/dx), and also u = cos(y) and v = dy/dx,
so that s = uv. By the product rule,

$$\frac{ds}{dx} = v\frac{du}{dx} + u\frac{dv}{dx} = \frac{dy}{dx} \cdot \frac{du}{dx} + \cos(y)\frac{d}{dx}\left(\frac{dy}{dx}\right) = \frac{dy}{dx} \cdot \frac{du}{dx} + \cos(y)\frac{d^2y}{dx^2}.$$

We still need to find du/dx, where u = cos(y). This is just the chain rule once
again:

$$\frac{du}{dx} = \frac{du}{dy}\,\frac{dy}{dx} = -\sin(y)\frac{dy}{dx}.$$

Putting it all together, we see that

$$\frac{ds}{dx} = \frac{dy}{dx} \cdot \left(-\sin(y)\frac{dy}{dx}\right) + \cos(y)\frac{d^2y}{dx^2} = -\sin(y)\left(\frac{dy}{dx}\right)^2 + \cos(y)\frac{d^2y}{dx^2}.$$

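Beware: the quantities

$$\left(\frac{dy}{dx}\right)^2 \quad \text{and} \quad \frac{d^2y}{dx^2}$$

are completely different! The left one is the square of the first derivative, while
the right one is the second derivative. Anyway, let’s put everything together.
Starting from

$$\frac{d}{dx}\left(2\frac{dy}{dx}\right) + \frac{d}{dx}\left(\cos(y)\frac{dy}{dx}\right) = \frac{d}{dx}\left(\frac{2x}{\pi}\right),$$

we can now write this as

$$2\frac{d^2y}{dx^2} - \sin(y)\left(\frac{dy}{dx}\right)^2 + \cos(y)\frac{d^2y}{dx^2} = \frac{2}{\pi}.$$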


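Phew. That was exhausting. We’re not done yet, though: we still need to
find d²y/dx² when x = π and y = π/2. So plug that in to the above equation:
you get

$$2\frac{d^2y}{dx^2} - \sin\left(\frac{\pi}{2}\right)\left(\frac{dy}{dx}\right)^2 + \cos\left(\frac{\pi}{2}\right)\frac{d^2y}{dx^2} = \frac{2}{\pi}.$$

This simplifies down to

$$2\frac{d^2y}{dx^2} - \left(\frac{dy}{dx}\right)^2 = \frac{2}{\pi}.$$

The problem is, we still need dy/dx! No problem: in our equation

$$2\frac{dy}{dx} + \cos(y)\frac{dy}{dx} = \frac{2x}{\pi}$$

way above, put x = π and y = π/2 (I didn’t let you do this before!) and
you get

$$2\frac{dy}{dx} + \cos\left(\frac{\pi}{2}\right)\frac{dy}{dx} = \frac{2\pi}{\pi},$$

so dy/dx = 1. Put that into our second derivative equation and we get

$$2\frac{d^2y}{dx^2} - (1)^2 = \frac{2}{\pi}.$$

This means that

$$\frac{d^2y}{dx^2} = \frac{1}{\pi} + \frac{1}{2}$$

when x = π and y = π/2, so we’re finally done!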
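
With this much algebra, an independent check is worth having. The sketch below takes a different route from the text: for each x it solves 2y + sin(y) = x²/π + 1 for y by Newton’s method, then estimates the first and second derivatives at x = π from nearby points. The starting guess, step size, and names are my own choices, and the finite-difference estimates are only approximate.

```python
import math

def solve_y(x: float, y_start: float = 1.5, steps: int = 30) -> float:
    """Newton's method for 2y + sin(y) = x^2/pi + 1 with x held fixed."""
    target = x * x / math.pi + 1.0
    y = y_start
    for _ in range(steps):
        y -= (2.0 * y + math.sin(y) - target) / (2.0 + math.cos(y))
    return y

x0 = math.pi
h = 1e-3
y_minus, y_mid, y_plus = solve_y(x0 - h), solve_y(x0), solve_y(x0 + h)

first_derivative = (y_plus - y_minus) / (2.0 * h)
second_derivative = (y_plus - 2.0 * y_mid + y_minus) / (h * h)
expected_second = 1.0 / math.pi + 0.5

print(f"y at x = pi:      {y_mid:.6f}   (pi/2 = {math.pi / 2:.6f})")
print(f"dy/dx estimate:   {first_derivative:.6f}   (expected 1)")
print(f"d2y/dx2 estimate: {second_derivative:.6f}   (expected {expected_second:.6f})")
```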


8.2 Related Rates

Consider two quantities (they can measure anything you like) that are related
to each other. If you know one, you can find the other. For example, if
you keep your eyes on an airplane that passes over your head, then the angle
that your line of sight makes with the ground depends on the position of the
plane. In this case, the two quantities are the position of the plane and the
angle I just described.

Of course, as one of the two quantities changes, so does the other. Suppose
that we know how fast one of the quantities is changing. Then how fast is
the other one changing? That is exactly what we mean by the term related
rates. You see, a rate of change is the speed at which a quantity is changing
over time. We have two quantities which are related to each other, and we
want to know how their rates of change are related to each other. (By the
way, sometimes we’ll abbreviate and say “rate” instead of “rate of change.”)

The above definition of a rate of change was a little sketchy. If you want
to know how fast something is changing over time, you simply have to dif-
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If you differentiate both sides implicitly with respect to you’ll find that 
the rates just pop out, giving you a new equation. The same thing works if 
you are dealing with three or more quantities which are related (for example, 
the length, width, and area of a rectangle). Just differentiate implicitly with 
respect to t and youll relate the rates of change. 

So, let’s look at a general overview of how to solve problems involving 
related rates. Then we’ll use it to solve a bunch of examples. 

1. Read the question. Identify all the quantities and note which one you 
need to find the rate of. Draw a picture if you need to! 

2. Write down an equation (sometimes you need more than one) that relates 
all the quantities. To do this, you may need to do some geometry, 
possibly involving similar triangles. If you have more than one equation, 
try to solve them simultaneously to eliminate unnecessary variables. 

3. Differentiate your remaining equation(s) implicitly with respect to time 
t. That is, whack both sides of each equation with a $\frac{d}{dt}$. You end up
with one or more equations relating the rates of change. 

4. Finally, substitute values for everything you know into all the equations 
you have. Solve the equations simultaneously to find the rate you need. 

The only difference between these types of problems and the word problems 
you’ve already seen is that step 3 was absent. Here, it makes all the
difference. Just one more thing before we look at examples: it’s vital that you
substitute values at the end, after differentiating! That is, don’t switch
steps 3 and 4. If you substitute values first, denying the quantities the ability
to change, then your rates will all be 0. That’s what you get for freezing
everything in place...


8.2.1 A simple example 



Here’s a relatively simple example to illustrate the above method. Suppose 
that a perfectly spherical balloon is being inflated by a pump. Air is entering 
the balloon at the constant rate of $12\pi$ cubic inches per second. At what rate
does the radius of the balloon change at the instant when the radius itself is
2 inches? Also, at what rate does the radius change when the volume is $36\pi$
cubic inches?


OK, let’s write down our quantities (step 1). These are the volume and 
the radius of the balloon. Let’s call the volume V (in cubic inches) and the 
radius r (in inches). We need to find the rate of change of the radius r. Now, 
we need an equation relating V and r (step 2). Here’s where the geometry 
comes in. Since the balloon is a sphere, we know that

$$V = \frac{4}{3}\pi r^3.$$

This relates the quantities. Now we need to relate the rates (step 3).
Differentiate both sides implicitly with respect to $t$:

$$\frac{d}{dt}(V) = \frac{d}{dt}\left(\frac{4}{3}\pi r^3\right).$$

The left-hand side is just $dV/dt$; to handle the right-hand side, let $s = r^3$, so
$ds/dr = 3r^2$. By the chain rule,

$$\frac{ds}{dt} = \frac{ds}{dr}\cdot\frac{dr}{dt} = 3r^2\frac{dr}{dt}.$$

Now we can put this in our above equation and get 


$$\frac{dV}{dt} = \frac{4}{3}\pi\times 3r^2\frac{dr}{dt} = 4\pi r^2\frac{dr}{dt}.$$


So we have an equation relating the rate of $V$ with the rate of $r$. Finally, we’re
ready to substitute (step 4). In both parts of the question, the rate of change
of volume is $12\pi$ cubic inches per second. In symbols, we have $dV/dt = 12\pi$.
Plugging this into the above equation, we get

$$12\pi = 4\pi r^2\frac{dr}{dt}.$$

Rearranging leads to

$$\frac{dr}{dt} = \frac{3}{r^2}.$$

Great — that means that if we know the radius r, then we can find the rate at 
which the radius is changing, which of course is dr/dt. Notice that the rate 
of change of the radius is itself a changing quantity: it depends on the radius. 
You’ve probably noticed that when you blow up a balloon, it grows in size (or 
radius) faster at the beginning, and then starts to slow down, even though 
you’re blowing the same amount of air into the balloon all the time. This is 
consistent with the above formula for dr/dt, which is decreasing in r. 

Armed with the formula, we can quickly do both parts of the question. In 
the first part, we know that the radius is 2 inches, so set r = 2 in our formula 
from above: 

$$\frac{dr}{dt} = \frac{3}{2^2} = \frac{3}{4}.$$

So the answer is 3/4. But 3/4 what? It’s important to write a sentence summarizing
the situation, as well as including the units of measurement. In this case,
we’d say that when the radius is 2 inches, the rate of change of the radius is
3/4 inches per second.

Now, for the second part of the question, we know that the volume is $36\pi$
cubic inches. That means that $V = 36\pi$. The problem is that we need to
know what $r$ is in order to find $dr/dt$. Now we need to go back to the equation
relating $V$ and $r$, which was $V = \frac{4}{3}\pi r^3$. If you put $V = 36\pi$ and solve for
$r$, you should be able to see that $r = 3$ inches. Finally, substituting into the
equation for $dr/dt$ gives

$$\frac{dr}{dt} = \frac{3}{r^2} = \frac{3}{3^2} = \frac{1}{3}.$$

So when the volume is $36\pi$ cubic inches, the rate of change of the radius is 1/3
inches per second.
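Here is a short Python sketch of the balloon calculation, in case you want to try other radii or volumes. The function names `radius_rate` and `radius_from_volume` are hypothetical; they simply encode the two equations $dV/dt = 4\pi r^2\,dr/dt$ and $V = \frac{4}{3}\pi r^3$ from above.

```python
import math

def radius_rate(r, dV_dt=12 * math.pi):
    """Solve dV/dt = 4*pi*r^2 * dr/dt for dr/dt (inches per second)."""
    return dV_dt / (4 * math.pi * r**2)

def radius_from_volume(V):
    """Invert V = (4/3)*pi*r^3 to recover the radius (inches)."""
    return (3 * V / (4 * math.pi)) ** (1 / 3)

# First part: radius is 2 inches.
print(radius_rate(2))

# Second part: volume is 36*pi cubic inches, so find r first.
r = radius_from_volume(36 * math.pi)
print(r, radius_rate(r))
```

Notice that the code differentiates nothing: the calculus was done by hand, and the program only does step 4 (substituting values at the end).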
8.2.2 A slightly harder example

Let’s look at another relatively straightforward example, this time involving 
three quantities. Suppose there are two cars, A and B. Car A is driving on a 
road heading directly north away from your house, and car B is driving on a 
different road heading directly west toward your house. Car A travels at 55 
miles per hour and car B travels at 45 miles per hour. At what rate is the 
distance between the cars changing when A is 21 miles north of your house 
and car B is 28 miles east of your house? 

To answer this question, we’d better draw a picture (step 1). Draw your 
house H and the cars A and B. Let the distance between H and A be given 
by a; let the distance between H and B be called b; and let the distance
between the cars be called c. The diagram looks like this: 



Note that it would be wrong to mark in 21 instead of a or 28 instead of b. 
You need to see what happens as a and b change, not when they are fixed at 
a certain number, so they need to have the flexibility of being variable. Also 
note that c is the quantity we need the rate of, since it’s the distance between 
the cars. 

Time for step 2. The equation relating $a$, $b$, and $c$ is nothing other than
Pythagoras’ Theorem:

$$a^2 + b^2 = c^2.$$

Moving on to step 3, we differentiate implicitly with respect to time $t$. Make
sure you agree that we get

$$2a\frac{da}{dt} + 2b\frac{db}{dt} = 2c\frac{dc}{dt}.$$


Now, we know that car A is moving at 55 miles an hour away from your 
house. This means that the distance a is increasing by 55 miles per hour, so 
da/dt = 55. As for B, it is moving at 45 miles an hour toward your house. 
This means that b is decreasing by 45 miles an hour, so db/dt = —45. You 
need that negative sign in there! Otherwise you’ll screw the whole thing up. 
Plugging these values in to the above equation leads to 


$$2a(55) + 2b(-45) = 2c\frac{dc}{dt},$$

which can be simplified to

$$c\frac{dc}{dt} = 55a - 45b.$$

Finally, we can see what happens at the instant of time we’re interested in.
This is when $a = 21$ and $b = 28$. At that instant, we know that $c^2 = 21^2 + 28^2$,
which works out to be $c = \pm 35$. Since $c$ is positive (it’s the distance between
the two cars!), we have $c = 35$. Put those numbers into the above equation
and you get

$$(35)\frac{dc}{dt} = 55(21) - 45(28).$$

You can compute this easily by canceling a factor of 5 and a factor of 7 from 
both sides. The end result is that dc/dt = —3. This means that the distance 
between the cars is decreasing at a rate of 3 miles per hour (at the moment 
in time we are considering). 

That’s the answer we need. Notice that the cars are actually getting closer 
together at the moment of time we’re considering, even though A is moving 
away from the house faster than B is coming toward it. If we wait a little
bit, car A will be farther away from the house and car B will be closer; 
by staring at the equation for dc/dt, you might convince yourself that this 
quantity eventually becomes positive (although this observation isn’t required 
to complete the question). 
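To see that sign change for yourself, here is a small Python sketch. It assumes both cars keep their speeds constant after the instant in question (the problem doesn’t say this, so treat it as an illustration only), and the function names are invented for this example.

```python
import math

def distance_rate(a, b, da_dt=55.0, db_dt=-45.0):
    """dc/dt from c * dc/dt = a * da/dt + b * db/dt, where c = sqrt(a^2 + b^2)."""
    c = math.hypot(a, b)
    return (a * da_dt + b * db_dt) / c

def distance_rate_after(hours):
    """The same rate some hours later, ASSUMING both speeds stay constant."""
    a = 21 + 55 * hours   # car A keeps moving away from the house
    b = 28 - 45 * hours   # car B keeps approaching the house
    return distance_rate(a, b)

print(distance_rate(21, 28))        # the instant from the question
print(distance_rate_after(0.05))    # three minutes later
```

Under that constant-speed assumption, the numerator $55a - 45b$ equals $-105 + 5050t$ after $t$ hours, so the rate turns positive a little over a minute later.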


8.2.3 A much harder example

Here’s a tougher example involving similar triangles: suppose there’s a freakin’ 
huge water tank in the shape of a cone (with the point at the bottom). The 
height of the cone is twice the radius of the cone. Water is being pumped into 
the tank at the rate of $8\pi$ cubic feet per second. At what rate is the water
level changing when the volume of water in the tank is $18\pi$ cubic feet?

There’s a second part as well: assume that the tank develops a little hole 
at the bottom that causes water to flow out at a rate of one cubic foot per 
second for every cubic foot of water in the tank. I want to know the same 
thing as before: at what rate is the water level changing when the volume of 
water in the tank is $18\pi$ cubic feet, but now with the leak in the tank?

Let’s start with the first part. Here’s a diagram of the situation: 



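[Diagram: the conical tank, point down, showing the tank height H and radius R, the water height h and water-surface radius r, and the triangles ABO and CDO.]

We have marked some quantities on the diagram. The height of the tank is H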
and its radius is R. The height of the water level is h and the radius of the top 
of the water surface is r. All these quantities are measured in feet. Let’s also 
let v be the volume of water in the tank, measured in cubic feet. (You could 
let V be the volume of the whole tank, but we’ll never need that quantity 
since the tank will never be full — it’s that huge!). Anyway, that takes care of 
step 1. 

For step 2, we have to start relating some of these quantities. We are 
given that the tank’s height is twice the radius, so we have H = 2R. We’re 
more interested in relating h and r, though. There are some similar triangles 
in the diagram: in fact, $\triangle ABO$ is similar to $\triangle CDO$, so $H/R = h/r$. Since
$H = 2R$, we have $2R/R = h/r$, which means that $h = 2r$. So the water is
like a mini-copy of the whole tank. Anyway, we still need to find the volume
of water in the tank in terms of $h$ and $r$. The volume of a cone of height $h$
units and radius $r$ units is given by $v = \frac{1}{3}\pi r^2 h$ cubic units. It would be nice
to eliminate one of $h$ and $r$ at this point; since we’re more interested in the
water level $h$ than the radius $r$ (read the question and see why!), it makes
sense to eliminate $r$. Using the equation $r = h/2$, we have

$$v = \frac{1}{3}\pi r^2 h = \frac{1}{3}\pi\left(\frac{h}{2}\right)^2 h, \qquad\text{so}\qquad v = \frac{\pi h^3}{12}.$$


Now, for step 3, let’s differentiate this with respect to time t. By the chain 
rule, 


$$\frac{dv}{dt} = \frac{\pi}{12}\times 3h^2\frac{dh}{dt} = \frac{\pi h^2}{4}\frac{dh}{dt}.$$

Great. Now for step 4, substitute in everything we know into the two equations
above. We know that $dv/dt = 8\pi$ and we’re interested in what happens
when $v = 18\pi$. Substituting, we get

$$18\pi = \frac{\pi h^3}{12} \qquad\text{and}\qquad 8\pi = \frac{\pi h^2}{4}\frac{dh}{dt}.$$

The first equation tells us that $h^3 = 18 \times 12 = 216$, so $h = 6$. That is, when
the water volume is $18\pi$ cubic feet, the water level is at 6 feet. Putting that
into the second equation, we get

$$8\pi = \frac{\pi\times 6^2}{4}\frac{dh}{dt},$$

which means that $dh/dt = 8/9$. That is, the water height is increasing at a
rate of 8/9 feet per second at the moment we care about (when the volume is
$18\pi$ cubic feet).

The second part is almost the same. In fact, the only difference occurs at
step 4. We still want to substitute in $v = 18\pi$, which will mean that $h = 6$
once again. On the other hand, it’s wrong to put $dv/dt = 8\pi$, since that
doesn’t take into account the leak. We know that $8\pi$ cubic feet of water is
entering into the tank per second, but one cubic foot is leaving per second for
every cubic foot of water in the tank. Since there are $v$ cubic feet of water in
the tank (by definition!), the rate of outflow from the leak is $v$ cubic feet per
second. So the rate of inflow is $8\pi$ and the rate of outflow is $v$ (both in cubic
feet per second), which means that

$$\frac{dv}{dt} = 8\pi - v.$$

Now, when $v = 18\pi$, we have $dv/dt = 8\pi - 18\pi = -10\pi$. So we need to
substitute $dv/dt = -10\pi$ and $h = 6$ into our previous equation

$$\frac{dv}{dt} = \frac{\pi h^2}{4}\frac{dh}{dt}.$$

The answer works out to be dh/dt = —10/9, which means that the water 
level in the tank is falling at a rate of 10/9 feet per second at the time we’re 
considering. Even though we’re pumping water in, the leak is letting even 
more water out and so the level is falling. 
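Both parts of the tank problem use the same two equations, $v = \pi h^3/12$ and $dv/dt = (\pi h^2/4)\,dh/dt$; only the value of $dv/dt$ changes. The following Python sketch makes that explicit. The names `water_level` and `level_rate` are hypothetical helpers, not anything from the text.

```python
import math

def water_level(v):
    """Invert v = pi * h^3 / 12 to get the water height h (feet)."""
    return (12 * v / math.pi) ** (1 / 3)

def level_rate(v, dv_dt):
    """Solve dv/dt = (pi * h^2 / 4) * dh/dt for dh/dt (feet per second)."""
    h = water_level(v)
    return dv_dt / (math.pi * h**2 / 4)

v = 18 * math.pi
inflow = 8 * math.pi

no_leak = level_rate(v, inflow)          # first part: dv/dt = 8*pi
with_leak = level_rate(v, inflow - v)    # second part: dv/dt = 8*pi - v

print(water_level(v), no_leak, with_leak)
```

The printed rates should match 8/9 and $-10/9$ up to rounding. The design point is that the leak never touches the geometry; it only changes what you substitute for $dv/dt$ in step 4.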


8.2.4 A really hard example

Here’s one more problem. Now that you have seen a number of related rate 
problems, perhaps you should try to solve it before reading the solution. 

Suppose that a plane is flying eastward directly away from you at a height 
of 2000 feet above your head. The plane moves at a constant speed of 500 
feet per second. Meanwhile, some time ago a parachutist jumped out of a 
helicopter (which has since flown away). The parachutist is floating directly 
downward, 1000 feet due east of you, at a constant speed of 10 feet per second. 
The situation is summarized in the following picture: 



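[Diagram: you on the ground, the plane at height 2000 feet, the parachutist 1000 feet due east of you, and the angle θ between the two lines of sight.]


In the picture, what you might call the inter-azimuthal angle between the
parachutist and the plane (with respect to you) is marked as $\theta$. The question
is, at what rate is $\theta$ changing when the plane and the parachutist have the
same height but the plane is 8000 feet due east of you?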

We have two objects to worry about, the plane and the parachutist. We 
know that the height of the plane is always 2000 feet (relative to your head), 
but we don’t know how far east the plane is; the distance keeps on changing.
Let the plane be $p$ feet to the east of you. As for the parachutist, this time
we know exactly how far east the parachutist is: 1000 feet. The problem is,
how high is the parachutist? Let the height be $h$ feet. By drawing a few extra
lines, we can recast the above diagram as follows:

[Diagram: two right-angled triangles with you at the common vertex; the plane at horizontal distance p and height 2000, the parachutist at horizontal distance 1000 and height h, with angles α (plane) and β (parachutist) measured from the ground.]

Notice that the quantities 1000 and 2000 never change, but the quantities p 
and h do change. In particular, the plane is heading to the right, so p is getting 
bigger; and the parachutist is heading down, so h is getting smaller. Even 
though the question asks us to concentrate on the moment when p = 8000 
and h = 2000 (the same height as the plane), it’s important that we allow p 
and h to vary so that we can work out the rate of change. After all, if p and 
h stay the same, then the plane and the parachutist just stay suspended in 
space in the same spot, and of course the angle $\theta$ wouldn’t change. That’s
hardly realistic, so we need to let $p$ and $h$ vary, in which case $\theta$ varies and we
can work out how fast it varies. That completes step 1.

Speaking of $\theta$, it’s clear from the diagram that it is simply the difference
between the angle $\beta$ the parachutist makes with the ground and the angle $\alpha$
the plane makes with the ground. (Let’s assume that you have no height, or if
you prefer, you are lying on the ground.) So we know that $\theta = \beta - \alpha$. Actually,
we should probably write $\theta = |\beta - \alpha|$, just in case the parachutist is much
lower than the plane. At around the time we’re interested in, the heights are
the same but the plane is much farther to the east than the parachutist, so $\beta$
must be bigger than $\alpha$ and we don’t need the absolute values.

Now, let’s do some trig. We have two right-angled triangles. From one of
them (the one with the plane), we get $\tan(\alpha) = 2000/p$. From the other one,
we have $\tan(\beta) = h/1000$. Let’s write these equations down in one place:

$$\tan(\alpha) = \frac{2000}{p} \qquad\text{and}\qquad \tan(\beta) = \frac{h}{1000}.$$
Step 2 is finally done, and we can move on to step 3, differentiating these
two relations implicitly with respect to time. Starting with the first one, let
$u = \tan(\alpha)$ and $v = 2000/p$, so our equation just becomes $u = v$. This means
that $du/dt = dv/dt$. Let’s find these two quantities using the chain rule. First,
$du/dt$:

$$\frac{du}{dt} = \frac{du}{d\alpha}\cdot\frac{d\alpha}{dt} = \sec^2(\alpha)\frac{d\alpha}{dt}.$$

And now for $dv/dt$:

$$\frac{dv}{dt} = \frac{dv}{dp}\cdot\frac{dp}{dt} = -\frac{2000}{p^2}\frac{dp}{dt}.$$

Since $du/dt = dv/dt$, we have

$$\sec^2(\alpha)\frac{d\alpha}{dt} = -\frac{2000}{p^2}\frac{dp}{dt}.$$

That’s just the first of our two trig equations. We need to repeat the exercise
for the second one involving $\tan(\beta)$. The left-hand side is handled exactly the
same way as we did $\tan(\alpha)$, but the right-hand side is much easier. Make sure
you agree that we get

$$\sec^2(\beta)\frac{d\beta}{dt} = \frac{1}{1000}\frac{dh}{dt}.$$

Remember, we also know that $\theta = \beta - \alpha$, so we can differentiate this also with
respect to time $t$ and get $d\theta/dt = d\beta/dt - d\alpha/dt$. Since there are so many
equations, let’s write all six of them down in one place:

$$\begin{array}{ll}
\tan(\alpha) = \dfrac{2000}{p} & \sec^2(\alpha)\dfrac{d\alpha}{dt} = -\dfrac{2000}{p^2}\dfrac{dp}{dt} \\[2ex]
\tan(\beta) = \dfrac{h}{1000} & \sec^2(\beta)\dfrac{d\beta}{dt} = \dfrac{1}{1000}\dfrac{dh}{dt} \\[2ex]
\theta = \beta - \alpha & \dfrac{d\theta}{dt} = \dfrac{d\beta}{dt} - \dfrac{d\alpha}{dt}
\end{array}$$

Now we’d better make some substitutions and get to the bottom of this mess.
What do we know? Well, the speed of the plane is 500 feet per second, which
means that $dp/dt = 500$. The speed of the parachutist is 10 feet per second,
but the height is decreasing, so $dh/dt = -10$. If you forget the minus sign,
you’ll get the answer wrong! So be very careful. For example, if the plane were
coming toward you, then $p$ would be decreasing, so $dp/dt$ would be negative.
Anyway, we’re interested in what happens when the plane is 8000 feet away,
so $p = 8000$, and when the parachutist is at height 2000 feet (the same as the
plane), so set $h = 2000$. The first four of our equations become a lot simpler:

$$\begin{array}{ll}
\tan(\alpha) = \dfrac{2000}{8000} = \dfrac{1}{4} & \sec^2(\alpha)\dfrac{d\alpha}{dt} = -\dfrac{2000}{8000^2}\times 500 = -\dfrac{1}{64} \\[2ex]
\tan(\beta) = \dfrac{2000}{1000} = 2 & \sec^2(\beta)\dfrac{d\beta}{dt} = \dfrac{1}{1000}\times(-10) = -\dfrac{1}{100}
\end{array}$$



From the top right equation, we could find $d\alpha/dt$ if only we knew what $\sec^2(\alpha)$
was. But wait a second: we do know that $\tan(\alpha) = 1/4$, so surely we can
find $\sec^2(\alpha)$. Remembering our trig identities (see Section 2.4 in Chapter 2),
we get

$$\sec^2(\alpha) = 1 + \tan^2(\alpha) = 1 + \left(\frac{1}{4}\right)^2 = \frac{17}{16}.$$

So the top right equation becomes

$$\frac{17}{16}\frac{d\alpha}{dt} = -\frac{1}{64},$$

which works out to be

$$\frac{d\alpha}{dt} = -\frac{1}{68}.$$

This rocks. We now need to do the same with $\beta$ and we’ll be done. Here we
know that $\tan(\beta) = 2$, so

$$\sec^2(\beta) = 1 + \tan^2(\beta) = 1 + 2^2 = 5.$$

Substituting into the bottom right equation above, we have

$$5\frac{d\beta}{dt} = -\frac{1}{100},$$

which means that

$$\frac{d\beta}{dt} = -\frac{1}{500}.$$

So we know $d\alpha/dt$ and $d\beta/dt$; from the final one of our original six equations
above,

$$\frac{d\theta}{dt} = \frac{d\beta}{dt} - \frac{d\alpha}{dt} = \left(-\frac{1}{500}\right) - \left(-\frac{1}{68}\right) = \frac{-17 + 125}{8500} = \frac{27}{2125}.$$

So the angle $\theta$ is increasing at a rate of 27/2125 radians per second (at the
moment we’re considering), and we’re finally done.
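With this many equations flying around, an independent check is reassuring. The Python sketch below writes $\theta$ directly as a function of time, using $p = 8000 + 500t$ and $h = 2000 - 10t$ (with $t = 0$ being the moment we care about), and estimates $d\theta/dt$ by a central difference. The function names and the choice of $t = 0$ are assumptions made for this illustration.

```python
import math
from fractions import Fraction

def theta(t):
    """Angle between the two lines of sight, t seconds after the moment of interest."""
    p = 8000 + 500 * t            # plane's horizontal distance (feet)
    h = 2000 - 10 * t             # parachutist's height (feet)
    alpha = math.atan(2000 / p)   # angle of the plane above the ground
    beta = math.atan(h / 1000)    # angle of the parachutist above the ground
    return beta - alpha

def theta_rate_numeric(dt=1e-3):
    """Central-difference estimate of d(theta)/dt at t = 0."""
    return (theta(dt) - theta(-dt)) / (2 * dt)

# Exact arithmetic for the last step: d(beta)/dt - d(alpha)/dt.
exact = Fraction(-1, 500) - Fraction(-1, 68)
print(exact)                                 # 27/2125
print(float(exact), theta_rate_numeric())
```

The numerical estimate skips all six equations, so if it agrees with the exact fraction, the signs of $dp/dt$ and $dh/dt$ were almost certainly handled correctly.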






CHAPTER 9


Exponentials and Logarithms 


Here’s a big old chapter on exponentials and logarithms. After we review 
the properties of these functions, we need to do some calculus with them. It 
turns out that there’s a special base, the number e, that works out particularly 
nicely. In particular, doing calculus with $e^x$ and $\log_e(x)$ is a little easier than
dealing with $2^x$ or $\log_3(x)$, for example. So we need to spend some time
looking at e. There are other things we want to look at as well; all in all, the 
plan is to check out the following topics: 

• review of the basics of exponentials and logs, and how they are related 
to each other; 

• the definition and properties of e; 

• how to differentiate exponentials and logs; 

• how to solve limit problems involving exponentials and logs; 

• logarithmic differentiation; 

• exponential growth and decay; and 

• hyperbolic functions. 

9.1 The Basics 

Before you start doing calculus with exponentials and logarithms, you really 
need to understand their properties. Basically, in addition to the actual
definition of logs, you need to know three things: the exponential rules, the
relationship between logs and exponentials, and the log rules. 

9.1.1 Review of exponentials

The rough idea is that we’ll take a positive number, called the base, and raise
it to a power called the exponent:

$$\text{base}^{\text{exponent}}.$$

For example, the number $2^{-5/2}$ is an exponential with base 2 and exponent
$-5/2$. It’s essential that you know the so-called exponential rules, which
effectively tell you how exponentials work. You’ve seen these before, no doubt,
but here they are again to remind you. For any base $b > 0$ and real numbers
$x$ and $y$:

1. $b^0 = 1$. The zeroth power of any nonzero number is 1.

2. $b^1 = b$. The first power of a number is just the number itself.

3. $b^x b^y = b^{x+y}$. When you multiply two exponentials with the same base,
you add the exponents.

4. $\dfrac{b^x}{b^y} = b^{x-y}$. When you divide two exponentials with the same base,
you subtract the bottom exponent from the top one.

5. $(b^x)^y = b^{xy}$. When you take the exponential of the exponential, you
multiply the exponents.

You should also know what the graphs of exponentials look like. We looked 
at this a little in Section 1.6 in Chapter 1, but in any case we’ll revisit the 
graph shortly. 

9.1.2 Review of logarithms

Logarithms: a word that strikes fear into the hearts of many students. Watch
carefully, and we’ll see how to deal with these beasts. Suppose that you want
to solve the following equation for $x$:

$$2^x = 7.$$

The way you can bring $x$ down from the exponent is to hit both sides with a
logarithm. Since the base on the left-hand side is 2, the base of the logarithm
is 2. Indeed, by definition, the solution of the above equation is

$$x = \log_2(7).$$

In other words, to what power do you have to raise 2 in order to get 7? The
answer is $\log_2(7)$. This particular number can’t be simplified, but how about
$\log_2(8)$? Ask yourself, to what power do you raise the base 2 in order to get
8? Since $2^3 = 8$, the power we need is 3. So $\log_2(8) = 3$.

Let’s go back to the equation $2^x = 7$. We know that this means that
$x = \log_2(7)$. If we now plug that value of $x$ into the original equation, we get
the bizarre looking formula

$$2^{\log_2(7)} = 7.$$

In more generality, $\log_b(y)$ is the power you have to raise the base $b$
to in order to get $y$. This means that $x = \log_b(y)$ is the solution of the
equation $b^x = y$ for given $b$ and $y$. Plugging this value of $x$ in, we get the
formula

$$b^{\log_b(y)} = y,$$

which is true for any $y > 0$ and $b > 0$ (except $b = 1$). Hey, why do I insist
that $b$ and $y$ be positive? First, if $b$ is negative, then many weird things can
happen. The quantity $b^x$ may not be defined. For example, if $b = -1$ and
$x = 1/2$, then $b^x$ is $(-1)^{1/2}$, which is $\sqrt{-1}$ (urk). So we’d better make
sure that $b > 0$. Then there’s no problem taking any power $b^x$, but the result
is always positive! So if $y = b^x$ then $y > 0$ by necessity, and it’s
nonsense to take the log of a negative number.

You might also have noticed that I mentioned that $b = 1$ is bad. If you put
$b = 1$ in the formula $b^{\log_b(y)} = y$ from above, you get $1^{\log_1(y)} = y$. The problem
is that 1 raised to any power still equals 1, but $y$ may not be 1, so the equation
doesn’t make sense. There just isn’t any base 1 logarithm. How about base
1/2? That’s OK, but there’s rarely any need for a base 1/2 logarithm, since
it turns out that $\log_{1/2}(y) = -\log_2(y)$ for any number $y$. (You can prove this
by setting $y = (1/2)^x$ and noting that $y$ also equals $2^{-x}$.) The same sort of
thing is true for any base $b$ between 0 and 1: $\log_b(y) = -\log_{1/b}(y)$ for all $y$,
and $1/b$ is greater than 1. So from now on, we’ll always assume that our base
$b$ is greater than 1.

9.1.3 Logarithms, exponentials, and inverses

We can describe everything we’ve seen above in a more sophisticated manner
by using inverse functions. Fix a base $b > 1$ and set $f(x) = b^x$. The function
$f$ has domain $\mathbb{R}$ and range $(0, \infty)$. Since it satisfies the horizontal line test, it
has an inverse, which we’ll call $g$. The domain of $g$ is the range of $f$, which
is $(0, \infty)$, while the range of $g$ is the domain of $f$, which is $\mathbb{R}$. We say that $g$
is the logarithm of base $b$; in fact, $g(x) = \log_b(x)$ by definition. Remembering
that the graph of the inverse function is the reflection of the original function
in the line $y = x$, we can get the graph of $y = \log_b(x)$ by reflecting the graph
of $y = b^x$. Since $f$ and $g$ are inverses of each other, we know that $f(g(x)) = x$
and $g(f(x)) = x$. Let’s look at these two facts one at a time.

1. We’ll start off with $f(g(x)) = x$. Since $g$ is the logarithm function, $x$
had better be positive: remember, you can only take the log of a positive
number. Now let’s take a close look at the quantity $f(g(x))$. You start out
with a positive number $x$, and hit it with $g$, which is the base $b$ logarithm. You
then exponentiate the result: that is, you raise $b$ to the power of $g(x)$. You end
up with your original number! In fact, since $f(x) = b^x$ and $g(x) = \log_b(x)$,
the formula $f(g(x)) = x$ just says that

$$b^{\log_b(x)} = x,$$

which was one of our formulas from the previous section (with $y$ replaced
by $x$). The exponential of the logarithm is the original number,
provided that the bases match!

2. Our other fact is that $g(f(x)) = x$, which is true for all $x$. Now we take
a number $x$, raise $b$ to the power of our number $x$, then take the base $b$
logarithm. Once again, we get the original number $x$ back. It’s sort of like
taking a positive number, squaring it and then taking the square root: you get
the original number back. Since $f(x) = b^x$ and $g(x) = \log_b(x)$, the equation
$g(f(x)) = x$ becomes

$$\log_b(b^x) = x \quad\text{for any real } x \text{ and } b > 1.$$

For example, when we looked at the equation $2^x = 7$ in the previous section,
you can take $\log_2$ of both sides to get

$$\log_2(2^x) = \log_2(7).$$

The left-hand side is just $x$, because the logarithm of the exponential
is the original number (provided that the bases match!). One more quick
example: to solve

$$3^{x^2 - 1} = 19,$$

simply take $\log_3$ of both sides:

$$\log_3\left(3^{x^2 - 1}\right) = \log_3(19).$$

The left-hand side is just $x^2 - 1$, so we have $x^2 - 1 = \log_3(19)$. This means
that $x = \pm\sqrt{\log_3(19) + 1}$.
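If you want decimal values for these two answers, Python’s standard `math` module can produce them. This is just a quick sketch: `math.log2` is the base 2 logarithm and `math.log(19, 3)` is the base 3 logarithm of 19, and the variable names are made up for this example.

```python
import math

# Solve 2**x = 7 by taking the base 2 log of both sides.
x_first = math.log2(7)

# Solve 3**(x**2 - 1) = 19: take the base 3 log, then solve for x.
x_second = math.sqrt(math.log(19, 3) + 1)   # the positive root; -x_second also works

# Plug the answers back in to see that they really do solve the equations.
print(x_first, 2 ** x_first)
print(x_second, 3 ** (x_second**2 - 1))
print(-x_second, 3 ** ((-x_second) ** 2 - 1))
```

Plugging the answers back in is the whole point: the second number on each line should come out very close to 7 or 19, with any tiny discrepancy due to floating-point rounding.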

9.1.4 Log rules 

The exponential rules from Section 9.1.1 above all have log versions, which are
(strangely enough) called log rules. There’s actually an extra log rule, the
change of base rule, that doesn’t have a corresponding exponential rule (see
#6 below).* So, here are the rules, which are valid for any base $b > 1$ and
positive real numbers $x$ and $y$:


* Actually, there is a change of base rule for exponentials too: $b^x = c^{x\log_c(b)}$ for $b > 0$,
$c > 1$, and $x > 0$. This isn’t normally included in the list of exponential rules because it
involves logarithms!
1. $\log_b(1) = 0$.

2. $\log_b(b) = 1$.

3. $\log_b(xy) = \log_b(x) + \log_b(y)$. The log of the product is the
sum of the logs.

4. $\log_b(x/y) = \log_b(x) - \log_b(y)$. The log of the quotient is the
difference of the logs.

5. $\log_b(x^y) = y\log_b(x)$. The log moves the exponent down in front
of the log. In this equation, $y$ can be any real
number (positive, negative or zero).

6. Change of base rule:

$$\log_b(x) = \frac{\log_c(x)}{\log_c(b)}$$

for any bases $b > 1$ and $c > 1$ and any number $x > 0$. This means that
all the log functions with different bases are really constant multiples of
each other. Indeed, the above equation says that

$$\log_b(x) = K\log_c(x),$$

where $K$ is constant (it happens to be equal to $1/\log_c(b)$). When I say
“constant,” I mean it doesn’t depend on $x$. We can conclude that the
graphs of $y = \log_b(x)$ and $y = \log_c(x)$ are very similar: you just stretch
the second one vertically by a factor of $K$ to get the first one.

Now, let’s see why these rules are all true. If you want, you can skip to the
next section, but believe me, you’ll understand logs a whole lot better if you
read on. Anyway, #1 above is pretty easy: because $b^0 = 1$ for any base $b > 1$,
we have $\log_b(1) = 0$. The same sort of thing works for #2: since $b^1 = b$ for
any $b > 1$, we can just write down $\log_b(b) = 1$.

The third rule is harder. We must show that $\log_b(xy) = \log_b(x) + \log_b(y)$,
where $x$ and $y$ are positive and $b > 1$. Let’s start off with our important fact,
which we’ve noted a couple of times above (with $A$ replacing the previous
variable):

$$b^{\log_b(A)} = A$$

for any $A > 0$. If we apply this three times with $A$ replaced by $x$, $y$, and $xy$,
respectively, we get

$$b^{\log_b(x)} = x, \qquad b^{\log_b(y)} = y, \qquad\text{and}\qquad b^{\log_b(xy)} = xy.$$

Now you can just multiply the first and second of these equations together,
then compare with the third equation to get

$$b^{\log_b(x)}\, b^{\log_b(y)} = xy = b^{\log_b(xy)}.$$

So what? Well, use exponential rule #3 on the left-hand side; since we have
to add the exponents, the equation becomes

$$b^{\log_b(x) + \log_b(y)} = b^{\log_b(xy)}.$$
Now hit both sides with a base $b$ log to kill the base $b$ on both sides; we’re
left with our log rule $\log_b(x) + \log_b(y) = \log_b(xy)$. Not so bad!

As for rule #4 above, I leave it to you to show this; the proof is almost
identical to the one we just did for #3. So, let’s go on to #5. We want to
show that $\log_b(x^y) = y\log_b(x)$, where $x > 0$, $b > 1$, and $y$ is any number at
all. To do this, start with the important fact from above but with $A$ replaced
by $x^y$. We get

$$b^{\log_b(x^y)} = x^y.$$

This gives us a weird way of expressing $x^y$. We could also replace $A$ instead
by $x$ to get

$$b^{\log_b(x)} = x,$$

then raise both sides to the power $y$:

$$\left(b^{\log_b(x)}\right)^y = x^y.$$

The left-hand side of this is just $b^{y\log_b(x)}$ by exponential rule #5 (see
Section 9.1.1 above). So we have two different expressions for $x^y$, which must be
equal to each other:

$$b^{\log_b(x^y)} = b^{y\log_b(x)}.$$

Again, hitting both sides with a logarithm base $b$ reduces everything to our
log rule

$$\log_b(x^y) = y\log_b(x).$$

Finally, we just need to prove the change of base rule. We’re actually going
to show that

$$\log_b(x)\log_c(b) = \log_c(x).$$

You see, if that’s true, then just divide both sides by $\log_c(b)$ to get the rule as
it’s described in #6 above. Anyway, let’s take the equation above and raise $c$
to the power of the left-hand side and right-hand side separately. We get

$$c^{\log_b(x)\log_c(b)} \qquad\text{and}\qquad c^{\log_c(x)},$$

respectively. The right-hand side is easy: it’s just $x$ because of our important
fact. How about the left-hand side? We use exponential rule #5 again in a
tricky way to write

$$c^{\log_b(x)\log_c(b)} = c^{\log_c(b)\times\log_b(x)} = \left(c^{\log_c(b)}\right)^{\log_b(x)}.$$

Since $c^{\log_c(b)} = b$ and $b^{\log_b(x)} = x$ by our important fact (twice), we conclude
that

$$c^{\log_b(x)\log_c(b)} = \left(c^{\log_c(b)}\right)^{\log_b(x)} = b^{\log_b(x)} = x.$$

So both of the quantities

$$c^{\log_b(x)\log_c(b)} \qquad\text{and}\qquad c^{\log_c(x)}$$

from above simplify down to just $x$! They must be equal to each other, then,
and if we knock out the base of $c$ (using a base $c$ logarithm), we get our desired
equation

$$\log_b(x)\log_c(b) = \log_c(x).$$

Well done if you took the trouble to understand all these proofs. 
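Proofs are one thing; seeing the rules hold for actual numbers is another. The Python sketch below computes logarithms straight from the definition (the power $t$ that solves $b^t = x$) by bisection, so it doesn’t secretly rely on the log rules it is checking. The helper `log_base` and the sample values are invented for this illustration; in everyday code you would just call the standard library’s `math.log`.

```python
def log_base(b, x, tol=1e-12):
    """Find t with b**t == x by bisection, assuming b > 1 and x > 0."""
    lo, hi = -1.0, 1.0
    while b ** lo > x:      # widen the bracket downward until b**lo <= x
        lo *= 2
    while b ** hi < x:      # widen the bracket upward until b**hi >= x
        hi *= 2
    while hi - lo > tol:    # b**t increases with t, so bisection homes in
        mid = (lo + hi) / 2
        if b ** mid < x:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

b, c, x, y = 2, 3, 5.0, 7.0

print(log_base(b, 8))                                         # close to 3
print(log_base(b, x * y), log_base(b, x) + log_base(b, y))    # rule 3
print(log_base(b, x / y), log_base(b, x) - log_base(b, y))    # rule 4
print(log_base(b, x ** y), y * log_base(b, x))                # rule 5
print(log_base(b, x), log_base(c, x) / log_base(c, b))        # rule 6
```

On each of the last four lines, the two printed numbers should agree to many decimal places. A numerical check like this is no substitute for the proofs, but it is a good way to catch a misremembered rule.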
9.2 Definition of e

So far, we haven’t done any calculus involving exponentials or logs. Let’s start 
doing some. We’ll begin with limits and then move on to derivatives. Along 
the way, we need to introduce a new constant e, which is a special number in 
the same sort of way that $\pi$ is a special number: it just pops up when you
start exploring math deeply enough. One way of seeing where e comes from 
involves a bit of a finance lesson. 

9.2.1 A question about compound interest

A long time ago, a dude named Bernoulli answered a question about compound
interest. Here’s the setup for his question. Let’s suppose you have a
bank account at a bank that pays interest at a generous rate of 12% annually,
compounded once a year. You put in an initial deposit; every year, your
fortune increases by 12%. This means that after n years, your fortune has
increased by a factor of (1 + 0.12)^n. In particular, after one year, your fortune
is just (1 + 0.12) = 1.12 times the original amount. If you started with $100, 
you’d finish the year with $112. 

Now suppose you find another bank that also offers an annual interest rate 
of 12%, but now it compounds twice a year. Of course you aren’t going to get 
12% for half a year; you have to divide that by 2. Basically this means that 
you are getting 6% interest for every 6 months. So, if you put money into this 
bank account, then after one year it has compounded twice at 6%; the result 
is that your fortune has expanded by a factor of \((1 + 0.06)^2\), which works out
to be 1.1236. So if you started with $100, you’d finish with $112.36.

The second account is a little better than the first. It makes sense when
you think about it: compounding is beneficial, so compounding more often
at the same annual rate should be better. Let’s try 3 times a year at the
annual rate of 12%. We take 12% and divide by 3 to get 4%, then compound
three times; our fortune has increased by a factor of \((1 + 0.04)^3\), which works
out to be 1.124864. This is a little higher still. How about 4 times a year? That’d be
\((1 + 0.03)^4\), which is approximately 1.1255. That’s even higher. Now, the
question is, where does it stop? If you compound more and more often at the
same annual rate, do you get wads and wads of cash after a year, or is there
some limitation on all this?
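The arithmetic above is easy to check numerically. Here is a minimal Python sketch; the function name `growth_factor` is just an illustrative choice and does not come from the text.

```python
def growth_factor(rate, n):
    """Factor by which a deposit grows in one year at annual rate `rate`,
    compounded n times a year: (1 + rate/n)**n."""
    return (1 + rate / n) ** n

# Compounding 1, 2, 3 and 4 times a year at 12%, then monthly and daily.
for n in (1, 2, 3, 4, 12, 365):
    print(n, round(growth_factor(0.12, n), 6))
```

The printed factors keep creeping upward as n grows, but only slowly, which is exactly the behavior the question is asking about.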

9.2.2 The answer to our question

To answer our question, let’s turn to some symbols. First, let’s suppose that 
we are compounding n times a year at an annual rate of 12%. This means 
that each time we compound, the amount of compounding is 0.12/n. After 
this happens n times in one year, our original fortune has grown by a factor 
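of

\[\left(1 + \frac{0.12}{n}\right)^n.\]

We want to know what happens if we compound more and more often; in
fact, let’s allow \(n\) to get larger and larger. That is, we’d like to know what
happens in the limit as \(n \to \infty\): what on earth is

\[\lim_{n\to\infty}\left(1 + \frac{0.12}{n}\right)^n?\]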

It would also be nice to know what happens at interest rates other than 12%. 
So let’s replace 0.12 by r and worry about the more general limit 

\[L = \lim_{n\to\infty}\left(1 + \frac{r}{n}\right)^n.\]

If this limit (which I called L) turns out to be infinite, then by compounding 
more and more often, you could get more and more money in a single year. 
On the other hand, if it turns out to be finite, we’ll have to conclude that 
there is a limitation on how much we can increase our fortune with an annual 
interest rate of r, no matter how often we compound. There would be a sort 
of “speed limit,” or more accurately, a “fortune-increase limit.” Given a fixed 
annual interest rate r and one year to play with, you can never increase your 
fortune by a factor of more than the value of the above limit (assuming it’s 
finite) no matter how often you compound. 

The quantity \((1 + r/n)^n\) which occurs in the limit is a special case of the
formula for compound interest. In general, suppose you start with $A in cash
and you put it in a bank account at an annual interest rate of r, compounded 
n times a year. Then over t years, the compounding will occur nt times at 
a rate of r/n each time; so your fortune after t years will be given by the 
following formula: 


fortune after t years, compounded n times a
year at a rate of r per year = \(A\left(1 + \frac{r}{n}\right)^{nt}\).

So we are just starting with $1 (so A = 1) and seeing what happens after one
year (so t = 1), then seeing what happens in the limit if we compound more
and more times a year.

Now let’s attack our limit: 

\[L = \lim_{n\to\infty}\left(1 + \frac{r}{n}\right)^n.\]

First, let’s set \(h = r/n\), so that \(n = r/h\). Then as \(n \to \infty\), we see that \(h \to 0^+\)
(since \(r\) is constant), so

\[L = \lim_{h\to 0^+}(1 + h)^{r/h}.\]

Now we can use our exponential rule to write

\[L = \lim_{h\to 0^+}\left((1 + h)^{1/h}\right)^r.\]

Let’s pull a huge rabbit out of the hat and set

\[e = \lim_{h\to 0^+}(1 + h)^{1/h}.\]



Where is the trickery? Well, the limit might not exist. It turns out that it
does; see Section A.5 of Appendix A if you want to know why. In any case, we
have a special number \(e\), which we’ll look at in more detail very soon. Back
to our limit, though; we now have

\[L = \lim_{h\to 0^+}\left((1 + h)^{1/h}\right)^r = e^r.\]

That’s the answer we’re looking for! Let’s put all the above steps together to 
see how it flows. With h = r/n, we have 

\[L = \lim_{n\to\infty}\left(1 + \frac{r}{n}\right)^n = \lim_{h\to 0^+}(1 + h)^{r/h} = \lim_{h\to 0^+}\left((1 + h)^{1/h}\right)^r = e^r.\]

This means that if you compound more and more frequently at an annual
rate of \(r\), your fortune will increase by a factor very close to \(e^r\), but never
more than that. The quantity \(e^r\) is the “fortune-increase limit” we’ve been
looking for. The only way you get this rate of increase is if you compound
continuously, that is, all the time!

So, suppose you start with $A in cash and put it in a bank account which
compounds continuously at an annual interest rate of \(r\). After 1 year, you’ll
have \(Ae^r\) dollars. After two years, you’ll have \(Ae^r \times e^r = Ae^{2r}\) dollars. It’s easy
to keep repeating this and see that after \(t\) years, you’ll have \(Ae^{rt}\) dollars. It
actually works for partial years as well, because of the exponential rules. So,
starting with $A, we have:


fortune after t years, compounded continuously
at a rate of r per year = \(Ae^{rt}\).

Compare this to the formula for compounding \(n\) times a year given above.
The quantities \(A(1 + r/n)^{nt}\) and \(Ae^{rt}\) look quite different, but for large
\(n\) they’re almost the same.
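To see how close the two formulas get, here is a small Python sketch that evaluates both. The function names `compound` and `continuous`, and the sample values A = 100, r = 0.12, t = 1, are illustrative assumptions rather than anything fixed by the text.

```python
import math

def compound(A, r, n, t):
    """Fortune after t years at annual rate r, compounded n times a year."""
    return A * (1 + r / n) ** (n * t)

def continuous(A, r, t):
    """Fortune after t years at annual rate r, compounded continuously."""
    return A * math.exp(r * t)

A, r, t = 100, 0.12, 1
for n in (1, 12, 365, 10**6):
    print(n, compound(A, r, n, t))
print("continuous:", continuous(A, r, t))
```

With these sample values the compounded amounts approach the continuous one from below as n grows.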

9.2.3 More about e and logs

Let’s take a closer look at our number \(e\). Remembering that

\[\lim_{n\to\infty}\left(1 + \frac{r}{n}\right)^n = e^r,\]

we can replace \(r\) by 1 to get

\[e = \lim_{n\to\infty}\left(1 + \frac{1}{n}\right)^n.\]

Of course, \(r = 1\) corresponds to an interest rate of 100% per year. Let’s draw
up a little table of values of \((1 + 1/n)^n\) to three decimal places for some
different values of \(n\):


n           | 1 | 2    | 3     | 4     | 5     | 10    | 100   | 1000  | 10000 | 100000
(1 + 1/n)^n | 2 | 2.25 | 2.370 | 2.441 | 2.488 | 2.594 | 2.705 | 2.717 | 2.718 | 2.718


Even compounding once a year at this humongous interest rate doubles your
money (that’s the “2” in the bottom row of the second column). Still, it
seems as if we can’t do much better than about 2.718, even if we compound
many many times a year. Our number \(e\), which is the limit as \(n \to \infty\) of the
numbers in the second row of the above table, turns out to be an irrational
number whose decimal expansion begins like this:

e = 2.71828182845904523... 


It looks like there’s a pattern near the beginning, with the repeated string 
“1828,” but that’s just a coincidence. In practice, just knowing that e is a 
little over 2.7 will be more than enough. 
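The values of (1 + 1/n)^n are easy to regenerate. The following Python sketch does so; the helper name `approx_e` is an illustrative assumption.

```python
import math

def approx_e(n):
    """The n-th approximation (1 + 1/n)**n to the number e."""
    return (1 + 1 / n) ** n

for n in (1, 2, 3, 4, 5, 10, 100, 1000, 10000, 100000):
    print(n, round(approx_e(n), 3))

# For comparison, the library constant rounded the same way.
print("e:", round(math.e, 3))
```

The approximations increase with n and settle down near 2.718, consistent with the decimal expansion of e quoted above.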

Now if \(x = e^r\), then \(r = \log_e(x)\). It turns out that taking logs base
\(e\) is such a common thing to do that we can even write it a different way:
\(\ln(x)\) instead of \(\log_e(x)\). The expression “ln(x)” is not pronounced “lin x” or
anything like that; just say “log x,” or perhaps “ell en x,” or if you’re feeling
particularly geeky, “the natural logarithm of x.” In fact, most mathematicians
write \(\log(x)\) without a base to mean the same thing as \(\log_e(x)\) or \(\ln(x)\). The
base \(e\) logarithm is called the natural logarithm. We’ll see one of the reasons
why it’s so natural when we differentiate \(\log_b(x)\) with respect to \(x\) in the next
section.

Since we have a new base e, and a new way of writing logarithms in that 
base, let’s take another look at the log rules and formulas we’ve seen so far. 
See if you can convince yourself that the following formulas are all true for
\(x > 0\) and \(y > 0\):

\[e^{\ln(x)} = x \qquad \ln(e^x) = x \qquad \ln(1) = 0 \qquad \ln(e) = 1\]

\[\ln(xy) = \ln(x) + \ln(y) \qquad \ln\left(\frac{x}{y}\right) = \ln(x) - \ln(y) \qquad \ln(x^y) = y\ln(x)\]



(Actually, in the second formula, x can even be negative or 0, and in the last 
formula, y can be negative or 0.) In any case, it’s really worth knowing these 
formulas in this form, since we will almost always be working with natural 
logarithms from now on. 
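A quick numerical spot check can help you believe these formulas. The Python sketch below tests each one at a single arbitrarily chosen pair x = 3.7, y = 0.42 (these sample values are assumptions for illustration); passing a spot check is of course not a proof.

```python
import math

x, y = 3.7, 0.42  # arbitrary positive sample values

checks = {
    "e^ln(x) = x":             (math.exp(math.log(x)), x),
    "ln(e^x) = x":             (math.log(math.exp(x)), x),
    "ln(1) = 0":               (math.log(1), 0.0),
    "ln(e) = 1":               (math.log(math.e), 1.0),
    "ln(xy) = ln(x) + ln(y)":  (math.log(x * y), math.log(x) + math.log(y)),
    "ln(x/y) = ln(x) - ln(y)": (math.log(x / y), math.log(x) - math.log(y)),
    "ln(x^y) = y ln(x)":       (math.log(x ** y), y * math.log(x)),
}

for name, (lhs, rhs) in checks.items():
    # abs_tol handles the comparison against exactly 0 in the ln(1) check
    print(name, math.isclose(lhs, rhs, rel_tol=1e-12, abs_tol=1e-12))
```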

One more point before we move on to differentiating logs and exponentials. 
Suppose you take the important limit 


\[\lim_{n\to\infty}\left(1 + \frac{r}{n}\right)^n = e^r,\]


and this time substitute \(h = 1/n\). As we noticed in the previous section, when
\(n \to \infty\), we have \(h \to 0^+\). So, replacing \(n\) by \(1/h\), we get

\[\lim_{h\to 0^+}(1 + rh)^{1/h} = e^r.\]


This is a right-hand limit. In fact, you can replace \(h \to 0^+\) by \(h \to 0\) and the
two-sided limit is still true. All we need to show is that the left-hand limit is
\(e^r\), and then, since both the left-hand and right-hand limits are the same, the
two-sided limit equals \(e^r\) as well. So consider

\[\lim_{h\to 0^-}(1 + rh)^{1/h} = \;?\]

Replace \(h\) by \(-t\); then \(t \to 0^+\) as \(h \to 0^-\). (When \(h\) is a small negative
number, \(t = -h\) is a small positive number.) So

\[\lim_{h\to 0^-}(1 + rh)^{1/h} = \lim_{t\to 0^+}(1 - rt)^{-1/t}.\]

Since \(A^{-1} = 1/A\) for any \(A \neq 0\), we can rewrite the limit as

\[\lim_{t\to 0^+}\frac{1}{(1 + (-r)t)^{1/t}}.\]

The denominator is just the classic limit but with interest rate \(-r\) instead of
\(r\). This means that in the limit as \(t \to 0^+\), the denominator goes to \(e^{-r}\). So
altogether we have

\[\lim_{h\to 0^-}(1 + rh)^{1/h} = \lim_{t\to 0^+}(1 - rt)^{-1/t} = \lim_{t\to 0^+}\frac{1}{(1 + (-r)t)^{1/t}} = \frac{1}{e^{-r}} = e^r.\]


The last step works because \(e^{-r} = 1/e^r\). So we have shown what we want
to show. Let’s change \(r\) to \(x\) in all our formulas (why not?) and summarize
what we’ve found:


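\[\lim_{n\to\infty}\left(1 + \frac{x}{n}\right)^n = e^x \quad\text{and}\quad \lim_{h\to 0}(1 + xh)^{1/h} = e^x.\]

When \(x = 1\), we get two formulas for \(e\):

\[\lim_{n\to\infty}\left(1 + \frac{1}{n}\right)^n = e \quad\text{and}\quad \lim_{h\to 0}(1 + h)^{1/h} = e.\]

These are important! We’ll look at some examples of how to use them in
Section 9.4.1 below. We’ll also use one of them to differentiate the log function,
right now.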

9.3 Differentiation of Logs and Exponentials

Now the plot thickens. Let \(g(x) = \log_b(x)\). What is the derivative of \(g\)? Using
the definition,

\[g'(x) = \lim_{h\to 0}\frac{g(x + h) - g(x)}{h} = \lim_{h\to 0}\frac{\log_b(x + h) - \log_b(x)}{h}.\]

How do we simplify this mess? We use the log rules, of course! First, use rule 
#4 from Section 9.1.4 above to turn the difference of logs into the log of the 
quotient: 

\[g'(x) = \lim_{h\to 0}\frac{1}{h}\log_b\left(\frac{x + h}{x}\right).\]

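We can simplify the fraction down to \((1 + h/x)\), but we also need to use log
rule #5 to pull the factor \(1/h\) up to be an exponent. So

\[g'(x) = \lim_{h\to 0}\frac{1}{h}\log_b\left(1 + \frac{h}{x}\right) = \lim_{h\to 0}\log_b\left(\left(1 + \frac{h}{x}\right)^{1/h}\right).\]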







1 78 • Exponentials and Logarithms 


Forget about the \(\log_b\) for the moment. What happens to

\[\left(1 + \frac{h}{x}\right)^{1/h}\]

as \(h\) goes to 0? That is, what is

\[\lim_{h\to 0}\left(1 + \frac{h}{x}\right)^{1/h}?\]

In the previous section, we saw that 

\[\lim_{h\to 0}(1 + hr)^{1/h} = e^r;\]

so if we replace \(r\) by \(1/x\), then this leads to

\[\lim_{h\to 0}\left(1 + \frac{h}{x}\right)^{1/h} = e^{1/x}.\]

So, if we go back to our expression for \(g'(x)\), we see that

\[g'(x) = \lim_{h\to 0}\log_b\left(\left(1 + \frac{h}{x}\right)^{1/h}\right) = \log_b\left(e^{1/x}\right).\]

In fact we can even make the expression simpler by using log rule #5 again:
the power \(1/x\) comes down out front and we have shown that

\[\frac{d}{dx}\log_b(x) = \frac{1}{x}\log_b(e).\]


Now, let’s set \(b = e\), so that we are taking the derivative of the log function
of base \(e\). We get

\[\frac{d}{dx}\log_e(x) = \frac{1}{x}\log_e(e).\]

But wait a second: by log rule #2, \(\log_e(e)\) is simply equal to 1. So this means
that

\[\frac{d}{dx}\log_e(x) = \frac{1}{x}.\]

That’s pretty nice. It’s actually really really nice. Kind of amazing, really.
Who would have thought that the derivative of \(\log_e(x)\) is just \(1/x\)? This is
one of the reasons why the logarithm base \(e\) is called the natural logarithm.
Writing \(\log_e(x)\) as \(\ln(x)\) (we made this definition in the previous section), we
get the important formula

\[\frac{d}{dx}\ln(x) = \frac{1}{x}.\]


Also, the above expression \(\frac{1}{x}\log_b(e)\) for the derivative of \(\log_b(x)\) can be written
in terms of natural logarithms by using the change of base formula (that’s #6
in Section 9.1.4 above). You see, by changing to base \(e\), we get

\[\log_b(e) = \frac{\log_e(e)}{\log_e(b)} = \frac{1}{\ln(b)}.\]

So we have

\[\frac{d}{dx}\log_b(x) = \frac{1}{x\ln(b)}.\]

This is the nicest way to express the derivative of a logarithm of a base other
than \(e\). Now watch this: if \(y = b^x\), then we know that \(x = \log_b(y)\). Now
differentiate with respect to \(y\); using the above formula with \(x\) replaced by \(y\),
we get

\[\frac{dx}{dy} = \frac{1}{y\ln(b)}.\]

By the chain rule, we can flip both sides to get

\[\frac{dy}{dx} = y\ln(b).\]

Since \(y = b^x\), we have proved the nice formula

\[\frac{d}{dx}(b^x) = b^x\ln(b).\]

In particular, if \(b = e\), then \(\ln(b) = \ln(e) = 1\). (That is just log rule #2 in
disguise; remember, \(\ln(e) = \log_e(e) = 1\).) So if \(b = e\), this formula becomes

\[\frac{d}{dx}(e^x) = e^x.\]

This is a pretty freaky formula. If \(h(x) = e^x\), then \(h'(x) = e^x\) as well: the
function \(h\) is its own derivative! Of course, the second derivative of \(e^x\) (with
respect to \(x\)) is also \(e^x\), as are the third derivative, the fourth derivative, and
so on.
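If you want some reassurance about these derivative formulas, you can compare them with a numerical slope. The Python sketch below uses a central difference quotient; the helper name `central_diff`, the step size and the sample values b = 3, x = 2 are illustrative assumptions, and the comparison is only approximate.

```python
import math

def central_diff(f, x, h=1e-6):
    """Approximate f'(x) by the symmetric difference quotient."""
    return (f(x + h) - f(x - h)) / (2 * h)

b, x = 3.0, 2.0  # sample base and sample point

# d/dx log_b(x) = 1 / (x ln(b))
print(central_diff(lambda t: math.log(t, b), x), 1 / (x * math.log(b)))

# d/dx b^x = b^x ln(b)
print(central_diff(lambda t: b ** t, x), b ** x * math.log(b))

# d/dx e^x = e^x
print(central_diff(math.exp, x), math.exp(x))
```

In each printed pair, the numerical slope and the formula should agree to several decimal places.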


9.3.1 Examples of differentiating exponentials and logs 


Now let’s look at how to apply some of the above formulas. First, if \(y = e^{-3x}\),
what is \(dy/dx\)? Well, if \(u = -3x\), then \(y = e^u\). We have

\[\frac{dy}{du} = \frac{d}{du}(e^u) = e^u \quad\text{and}\quad \frac{du}{dx} = \frac{d}{dx}(-3x) = -3.\]

By the chain rule,

\[\frac{dy}{dx} = \frac{dy}{du}\frac{du}{dx} = e^u(-3) = -3e^{-3x};\]


notice that we replaced \(u\) by \(-3x\) in the last step. In fact, this is a special
case of a nice rule: if \(a\) is constant, then

\[\frac{d}{dx}e^{ax} = ae^{ax}.\]

This formula can be proved in the same way by letting \(u = ax\). In fact,
it’s exactly the same as the principle we saw at the end of Section 7.2.1 in
Chapter 7: if \(x\) is replaced by \(ax\), then there is an extra factor of
\(a\) out front when you differentiate. So it should be no problem, for
example, to differentiate \(\ln(8x)\) with respect to \(x\). In fact,

\[\frac{d}{dx}\ln(8x) = 8 \times \frac{1}{8x},\]

since the derivative of \(\ln(x)\) with respect to \(x\) is \(1/x\). Now, the factors of 8
cancel and we see that

\[\frac{d}{dx}\ln(8x) = \frac{1}{x}.\]

That’s weird: the derivative of \(\ln(8x)\) is the same as the derivative of \(\ln(x)\)!
Not so weird when you think about it: \(\ln(8x) = \ln(8) + \ln(x)\), so in fact the
quantities \(\ln(8x)\) and \(\ln(x)\) just differ by a constant and therefore have the
same derivative with respect to \(x\).

Here’s a harder example:

if \(y = e^{x^2}\log_3(5^x - \sin(x))\), what is \(\dfrac{dy}{dx}\)?

Let’s use the product rule and the chain rule. Start off by setting \(u = e^{x^2}\) and
\(v = \log_3(5^x - \sin(x))\), so \(y = uv\). For the product rule, we need to differentiate
\(u\) and \(v\) (with respect to \(x\)), so let’s do them one at a time. Starting with
\(u = e^{x^2}\), let \(t = x^2\) so that \(u = e^t\); then, using the chain rule, we have

\[\frac{du}{dx} = \frac{du}{dt}\frac{dt}{dx} = e^t(2x) = 2xe^{x^2}.\]

As for \(v\), let \(s = 5^x - \sin(x)\) so that \(v = \log_3(s)\). By the chain rule,

\[\frac{dv}{dx} = \frac{dv}{ds}\frac{ds}{dx} = \frac{1}{s\ln(3)}\left(5^x\ln(5) - \cos(x)\right) = \frac{5^x\ln(5) - \cos(x)}{\ln(3)(5^x - \sin(x))}.\]

Here we’ve used the formulas from the previous section for the derivatives of
\(\log_b(x)\) (with \(b = 3\)) and \(b^x\) (with \(b\) now equal to 5). Anyway, since \(y = uv\),
we have

\[\frac{dy}{dx} = v\frac{du}{dx} + u\frac{dv}{dx} = 2xe^{x^2}\log_3(5^x - \sin(x)) + e^{x^2}\,\frac{5^x\ln(5) - \cos(x)}{\ln(3)(5^x - \sin(x))}.\]

As usual, it’s a bit of a mess, but the example does illustrate the main points
involved; as long as you know the basic formulas for differentiating exponentials
and logs (they are the boxed equations in the previous section), then
you’ll be all set.
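A messy derivative like this one is a good candidate for a numerical sanity check. The Python sketch below assumes the product-rule answer for y = e^(x^2) log_3(5^x − sin(x)) is 2x e^(x^2) log_3(5^x − sin(x)) + e^(x^2)(5^x ln(5) − cos(x)) / (ln(3)(5^x − sin(x))) and compares it with a difference quotient at the sample point x = 1; the function names and the sample point are illustrative assumptions.

```python
import math

def y(x):
    """y = e^(x^2) * log_3(5^x - sin(x))."""
    return math.exp(x ** 2) * math.log(5 ** x - math.sin(x), 3)

def dy_dx(x):
    """Derivative of y obtained from the product rule and the chain rule."""
    s = 5 ** x - math.sin(x)
    du = 2 * x * math.exp(x ** 2)                                   # derivative of e^(x^2)
    dv = (5 ** x * math.log(5) - math.cos(x)) / (math.log(3) * s)   # derivative of log_3(s)
    return math.log(s, 3) * du + math.exp(x ** 2) * dv

def central_diff(f, x, h=1e-6):
    """Approximate f'(x) by the symmetric difference quotient."""
    return (f(x + h) - f(x - h)) / (2 * h)

x0 = 1.0
print(dy_dx(x0), central_diff(y, x0))
```

The two printed numbers should agree to several decimal places.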


9.4 How to Solve Limit Problems involving 
Exponentials or Logs 

Now it’s time to see how to solve a bunch of limit problems. As in the case of 
all the previous limits we’ve looked at, it’s really important to note whether 
you are evaluating functions near 0 (that is, at small arguments), near ∞ or
−∞ (large arguments), or somewhere else that’s neither small nor large. We’ll
examine some of these cases in some detail with respect to exponentials and 
logarithms. Let’s start off, though, with limits involving the definition of e. 
9.4.1 Limits involving the definition of e

Consider the following limit:

\[\lim_{h\to 0}(1 + 3h^2)^{1/(3h^2)}.\]

It looks pretty similar to the limit involving \(e\) from Section 9.2.3 above:

\[\lim_{h\to 0}(1 + h)^{1/h} = e.\]

If we take this limit, and replace \(h\) by \(3h^2\) everywhere we see it, then we get

\[\lim_{3h^2\to 0}(1 + 3h^2)^{1/(3h^2)} = e.\]

This is almost exactly what we want. All we have to do is note that \(3h^2 \to 0\)
as \(h \to 0\), so

\[\lim_{h\to 0}(1 + 3h^2)^{1/(3h^2)} = e.\]

Using the same logic, we can show (for example) that

\[\lim_{h\to 0}(1 + \sin(h))^{1/\sin(h)} = e.\]

Indeed, if you replace \(h\) by any quantity that goes to 0 as \(h \to 0\), such as \(3h^2\)
or \(\sin(h)\), then the limit is still \(e\). So how about

\[\lim_{h\to 0}(1 + \cos(h))^{1/\cos(h)}?\]

You can’t just repeat the previous argument, since \(\cos(h) \to 1\) as \(h \to 0\). In
fact, if you just substitute \(h = 0\) into the expression \((1 + \cos(h))^{1/\cos(h)}\), then
you get \((1 + 1)^1 = 2\), so the above limit is in fact equal to 2.

Now consider

\[\lim_{h\to 0}(1 + h^2)^{1/(3h^2)}.\]

There is a mismatch between an \(h^2\) term and a \(3h^2\) term. They are similar,
but the coefficients aren’t the same. We need to write the exponent \(1/(3h^2)\) as
\((1/h^2) \times (1/3)\) and use an exponential rule:

\[\lim_{h\to 0}(1 + h^2)^{1/(3h^2)} = \lim_{h\to 0}(1 + h^2)^{(1/h^2)\times(1/3)} = \lim_{h\to 0}\left((1 + h^2)^{1/h^2}\right)^{1/3}.\]

Since the \(h^2\) terms match up, the part inside the big parentheses goes to \(e\),
and the whole limit is therefore \(e^{1/3}\).

Here’s a slightly harder example: what is the value of

\[\lim_{h\to 0}(1 - 5h^3)^{2/h^3}?\]

It’s annoying, but the small quantities \(-5h^3\) and \(h^3\) don’t quite match, and
there’s also that 2. We need to match them up by fiddling with the exponent
\(2/h^3\) to match the \(-5h^3\) term. The best way to look at it is to see how nice
everything would be if we instead wanted to find

\[\lim_{h\to 0}(1 - 5h^3)^{1/(-5h^3)},\]

because this limit is just \(e\). Yup, the \(-5h^3\) terms match and so this is nothing
more than our classic limit

\[\lim_{h\to 0}(1 + h)^{1/h} = e,\]

with \(h\) replaced by \(-5h^3\). Unfortunately, we have to do a little more work.
Somehow we need to turn \(1/(-5h^3)\) into \(2/h^3\). To do that, we have to multiply
by \(-5\) to get rid of the \(-5\) in the denominator, and then multiply again by
2 to fix up the numerator. The overall effect is that we should multiply by
\(-10\). So, we have

\[\lim_{h\to 0}(1 - 5h^3)^{2/h^3} = \lim_{h\to 0}(1 - 5h^3)^{(1/(-5h^3))\times(-10)} = \lim_{h\to 0}\left((1 - 5h^3)^{1/(-5h^3)}\right)^{-10} = e^{-10}.\]
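The limits in this section can be sanity-checked by plugging in a small value of h. The Python sketch below does this for the five examples; the choice h = 0.001 and the layout of the `examples` list are illustrative assumptions, and a single evaluation only suggests the limit rather than proving it.

```python
import math

# Each entry: (description, expression as a function of h, value claimed in the text)
examples = [
    ("(1 + 3h^2)^(1/(3h^2))", lambda h: (1 + 3 * h**2) ** (1 / (3 * h**2)), math.e),
    ("(1 + sin h)^(1/sin h)", lambda h: (1 + math.sin(h)) ** (1 / math.sin(h)), math.e),
    ("(1 + cos h)^(1/cos h)", lambda h: (1 + math.cos(h)) ** (1 / math.cos(h)), 2.0),
    ("(1 + h^2)^(1/(3h^2))",  lambda h: (1 + h**2) ** (1 / (3 * h**2)), math.e ** (1 / 3)),
    ("(1 - 5h^3)^(2/h^3)",    lambda h: (1 - 5 * h**3) ** (2 / h**3), math.e ** -10),
]

h = 1e-3  # small, but not so tiny that rounding error swamps the answer
for name, f, claimed in examples:
    print(f"{name}: value at h={h} is {f(h):.6g}, claimed limit is {claimed:.6g}")
```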


9.4.2 Behavior of exponentials near 0

We’d like to understand how \(e^x\) behaves when \(x\) is really close to 0. In fact,
since \(e^0 = 1\), we know that

\[\lim_{x\to 0}e^x = e^0 = 1.\]

Of course, you can replace \(x\) by another quantity that goes to 0 when \(x \to 0\)
and get the same limit. For example,

\[\lim_{x\to 0}e^{x^2} = e^{0^2} = 1\]

as well. So, we can find

\[\lim_{x\to 0}\frac{e^{x^2}\sin(x)}{x}\]

by splitting up like this:

\[\lim_{x\to 0}\frac{e^{x^2}\sin(x)}{x} = \lim_{x\to 0}\left(e^{x^2}\right)\left(\frac{\sin(x)}{x}\right).\]

Both factors tend to 1 as \(x \to 0\), so the overall limit is \(1 \times 1 = 1\). Now, here’s
a trickier example:

\[\lim_{x\to\infty}\frac{2x^2 + 3x - 1}{e^{1/x}(x^2 - 7)}.\]

As \(x\) gets very large, \(1/x\) gets very close to 0; so \(e^{1/x}\) is very close to 1 and
can be ignored. Your best bet is to write the limit as

\[\lim_{x\to\infty}\frac{1}{e^{1/x}} \times \frac{2x^2 + 3x - 1}{x^2 - 7}.\]

The first fraction goes to 1, and using the techniques from Section 4.3 of
Chapter 4, you can show that the second factor goes to 2; so the limit is 2.
This sort of approach works well if your exponential term appears in a
product or a quotient, but it fails miserably with something like this:

\[\lim_{h\to 0}\frac{e^h - 1}{h}.\]

It’s tempting to replace the \(e^h\) by 1, which is fair enough, except that you
get a useless 0/0 case. The problem is that we have a difference between \(e^h\)
and 1, which gets very small when \(h\) is near 0. So what do we do? As we
saw in Section 6.5 in Chapter 6, when the dummy variable is by itself on the
bottom, your limit might be a derivative in disguise. Try setting \(f(x) = e^x\),
so that \(f'(x) = e^x\) as well (as we saw in Section 9.3 above). In this case, the
standard formula

\[\lim_{h\to 0}\frac{f(x + h) - f(x)}{h} = f'(x)\]

becomes

\[\lim_{h\to 0}\frac{e^{x+h} - e^x}{h} = e^x.\]

Now all we need to do is replace \(x\) by 0. Since \(e^0 = 1\), we get the useful fact
that

\[\lim_{h\to 0}\frac{e^h - 1}{h} = 1.\]

Once again, you can replace \(h\) by any small quantity. For example,

\[\lim_{s\to 0}\frac{e^{3s^5} - 1}{s^5} = \lim_{s\to 0}\frac{e^{3s^5} - 1}{3s^5} \times 3 = 1 \times 3 = 3.\]

The standard matching trick works; this is really the same trick we used
in poly-type limits (Chapter 4), trig limits where the arguments are small
(Chapter 7), and the limits in Section 9.4.1 above.
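Here is a short Python sketch that evaluates these quotients at small arguments. It uses the standard-library function `math.expm1`, which computes e^x − 1 accurately when x is small (subtracting 1 from `math.exp(x)` directly loses precision there); the sample values of h and s are illustrative assumptions.

```python
import math

# (e^h - 1) / h for shrinking h: the values should approach 1.
for h in (1e-1, 1e-3, 1e-6):
    print(h, math.expm1(h) / h)

# (e^(3 s^5) - 1) / s^5 for a small s: the value should be close to 3.
s = 0.1
print(math.expm1(3 * s**5) / s**5)
```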

9.4.3 Behavior of logs near 1

Now let’s look at how logs behave near 1. It turns out that the situation is
pretty similar to the case of exponentials near 0. We know that \(\ln(1) = 0\),
but what is

\[\lim_{h\to 0}\frac{\ln(1 + h)}{h}?\]

Believe it or not, this is another example of a limit which is a derivative
in disguise (see Section 6.5 in Chapter 6). Set \(f(x) = \ln(x)\) and note that
\(f'(x) = 1/x\), as we saw in Section 9.3. Now the equation

\[\lim_{h\to 0}\frac{f(x + h) - f(x)}{h} = f'(x)\]

becomes

\[\lim_{h\to 0}\frac{\ln(x + h) - \ln(x)}{h} = \frac{1}{x}\]

for any \(x\). All that’s left is to put \(x = 1\) and get

\[\lim_{h\to 0}\frac{\ln(1 + h) - \ln(1)}{h} = \frac{1}{1} = 1.\]

Since \(\ln(1) = 0\), this simplifies to

\[\lim_{h\to 0}\frac{\ln(1 + h)}{h} = 1.\]

Once again, \(h\) can be replaced by any quantity which goes to 0 as \(h \to 0\) and
the limit will still be 1. For example, to find

\[\lim_{h\to 0}\frac{\ln(1 - 7h^2)}{5h^2},\]

you have to mess with the denominator to make it look like \(-7h^2\) as follows:

\[\lim_{h\to 0}\frac{\ln(1 - 7h^2)}{5h^2} = \lim_{h\to 0}\frac{\ln(1 - 7h^2)}{-7h^2} \times \frac{-7h^2}{5h^2}.\]

It’s just our old trick of multiplying and dividing by a useful quantity (\(-7h^2\)
in this case). Anyway, the first fraction has limit 1 since the small quantity
\(-7h^2\) matches, and the second fraction just cancels down to be \(-7/5\). So the
limit is \(-7/5\).
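The same kind of numerical check works for logs near 1. The Python sketch below uses the standard-library function `math.log1p`, which computes ln(1 + x) accurately for small x; the sample values of h are illustrative assumptions.

```python
import math

# ln(1 + h) / h for shrinking h: the values should approach 1.
for h in (1e-1, 1e-3, 1e-6):
    print(h, math.log1p(h) / h)

# ln(1 - 7h^2) / (5h^2) for a small h: the value should be close to -7/5.
h = 1e-3
print(math.log1p(-7 * h**2) / (5 * h**2))
```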

9.4.4 Behavior of exponentials near ∞ or −∞

Now we want to understand what happens to \(e^x\) when \(x \to \infty\) or \(x \to -\infty\).
Let’s take another look at the graph of \(y = e^x\):

[Figure: graph of y = e^x]

Beware: the curve above looks as if it touches the x-axis at the left side of
the graph, but it doesn’t; remember, \(e^x > 0\) for all \(x\), so there are no
x-intercepts. (This is a good argument against relying too strongly on graphing
calculators in order to understand what’s going on!) In any case, it seems
that we should at least have

\[\lim_{x\to\infty}e^x = \infty \quad\text{and}\quad \lim_{x\to-\infty}e^x = 0.\]

What if the base \(e\) is replaced by some other base? For example, consider

\[\lim_{x\to\infty}2^x \quad\text{and}\quad \lim_{x\to\infty}\left(\frac{1}{3}\right)^x.\]

To handle the first one, let’s use the identity \(A = e^{\ln(A)}\) with \(A = 2^x\) to write

\[2^x = e^{\ln(2^x)} = e^{x\ln(2)}.\]

Now as \(x \to \infty\), we also have \(x\ln(2) \to \infty\), so the first limit is ∞. As for the
second limit, this time we can use the same trick to write

\[\left(\frac{1}{3}\right)^x = \frac{1}{3^x} = \frac{1}{e^{x\ln(3)}}.\]

As \(x \to \infty\), we see that \(e^{x\ln(3)} \to \infty\), so the reciprocal goes to 0. We have
proved that

\[\lim_{x\to\infty}2^x = \infty \quad\text{and}\quad \lim_{x\to\infty}\left(\frac{1}{3}\right)^x = 0.\]

These are special cases of the following important limit:

\[\lim_{x\to\infty}r^x = \begin{cases}\infty & \text{if } r > 1,\\ 1 & \text{if } r = 1,\\ 0 & \text{if } 0 < r < 1.\end{cases}\]

The middle case, when \(r = 1\), is obvious, since \(1^x = 1\) for all \(x > 0\). The other
two cases can be shown in the same way as we handled the limits of \(2^x\) and
\((1/3)^x\) above: just write \(r^x\) as \(e^{x\ln(r)}\).

This is not the whole story. The limit

\[\lim_{x\to\infty}e^x = \infty\]

says that \(e^x\) gets larger and larger (as large as you want) when \(x\) gets larger;
but how fast does this happen? After all,

\[\lim_{x\to\infty}x^2 = \infty\]

as well. Which one grows faster, \(x^2\) or \(e^x\)? The answer is that \(e^x\) kicks butt
over \(x^2\) when \(x\) is large. After all, when \(x = 100\), the quantity \(x^2\) is only
\(100 \times 100\), while

\[e^{100} = e \times e \times \cdots \times e.\]

There are a hundred factors of \(e\) but only two factors of 100, so \(e^{100}\) is much
bigger than \(100^2\). The situation is even more in favor of \(e^x\) when \(x\) is larger
still. Since \(e^x\) is so much bigger than \(x^2\), when you divide \(x^2\) by \(e^x\) you should
get a tiny number. In fact,

\[\lim_{x\to\infty}\frac{x^2}{e^x} = 0.\]

We won’t prove this until we look at l’Hôpital’s Rule in Chapter 14. For the
moment, I want to point out that the above limit is also true if you replace
\(x^2\) by any power of \(x\). Even \(x^{999}\) can’t compete with \(e^x\). When \(x\) is a billion,
\(x^{999}\) is 999 copies of a billion, multiplied together, but \(e^x\) is a billion copies
of \(e\), multiplied together! Even though \(e\) is a lot smaller than a billion, \(e^x\) just
walks all over \(x^{999}\) in terms of size when \(x\) is large. So in general we have the
following principle:


Exponentials grow quickly:

\[\lim_{x\to\infty}\frac{x^n}{e^x} = 0\]

no matter how large n is.
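Numbers like a billion raised to the 999th power overflow ordinary floating-point arithmetic, so a direct numerical comparison is awkward. One workaround, sketched below in Python, is to look at the natural log of the ratio instead: ln(x^n / e^x) = n ln(x) − x. The helper name `log_ratio` and the sample values of x are illustrative assumptions.

```python
import math

def log_ratio(n, x):
    """Natural log of x**n / e**x, computed as n*ln(x) - x to avoid overflow."""
    return n * math.log(x) - x

# With n = 999 the ratio is enormous for moderate x (positive log),
# but once x is large enough the log becomes hugely negative,
# which means the ratio itself is extremely close to 0.
for x in (1e2, 1e4, 1e6, 1e9):
    print(x, log_ratio(999, x))
```

This also illustrates why the principle is a statement about large x: at x = 100 the power is still far ahead of the exponential.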


In fact, by tweaking this a little, you can get a more general statement:

\[\lim_{x\to\infty}\frac{\text{poly-type stuff}}{\text{exponential of large, positive poly-type stuff}} = 0.\]

For example,

\[\lim_{x\to\infty}\frac{x^8 + 100x^7 - 4}{e^x} = 0.\]

To see why, simply split up the fraction into three pieces, each of which goes
to 0 because exponentials grow quickly. More subtly,

\[\lim_{x\to\infty}\frac{x^{10000} + 300x^9 + 32}{e^{2x^3 - 19x^2 - 100}} = 0.\]

Here the crucial fact is that \(2x^3 - 19x^2 - 100\) behaves like \(2x^3\) when \(x\) is large,
so the exponential is indeed of large, positive poly-type stuff.* In fact, the
base \(e\) can be replaced by any other base greater than 1. For example,

\[\lim_{x\to\infty}\frac{x^{10000} + 300x^9 + 32}{2^{2x^3 - 19x^2 - 100}} = 0\]

as well. Another variation involves the fact that \(e^{-x}\) is just another way of
writing \(1/e^x\). Here’s an example of this:

\[\lim_{x\to\infty}(x^5 + 3)^{101}e^{-x}.\]

We can just write this as

\[\lim_{x\to\infty}(x^5 + 3)^{101}e^{-x} = \lim_{x\to\infty}\frac{(x^5 + 3)^{101}}{e^x} = 0;\]

here the limit is 0 because exponentials grow quickly. Now consider the very
similar limit

\[\lim_{x\to-\infty}(x^5 + 3)^{101}e^x.\]

This of course involves the behavior of \(e^x\) near −∞, but you can just throw
the situation over to +∞ by setting \(t = -x\). We can see that as \(x \to -\infty\),
we have \(t \to +\infty\). So

\[\lim_{x\to-\infty}(x^5 + 3)^{101}e^x = \lim_{t\to\infty}((-t)^5 + 3)^{101}e^{-t} = \lim_{t\to\infty}\frac{(-t^5 + 3)^{101}}{e^t} = 0.\]

Once again, the limit is 0 because the numerator is a polynomial (it doesn’t
matter that its leading coefficient is negative). So you can deal with \(e^x\) as
\(x \to -\infty\) by substituting \(t = -x\); this means that you now have to deal with
\(e^{-t}\) as \(t \to \infty\), and you just handle that by writing \(e^{-t}\) as \(1/e^t\).


*If you really want to nail it, you must write something clever like \(2x^3 - 19x^2 - 100 > x^3\)
for large enough \(x\). After all, if \(2x^3 - 19x^2 - 100\) behaves like \(2x^3\), then clearly it must
eventually be larger than \(x^3\). So our denominator is bigger than \(e^{x^3}\). Now replace \(x^3\) by
\(u\), so that the denominator is just \(e^u\) and the numerator is some easy-to-deal-with mess.
Finally, use the sandwich principle.
9.4.5 Behavior of logs near ∞

The saga continues. Let’s look at what happens to \(\ln(x)\) when \(x\) is a large
positive number. (Remember, you can’t take the log of any negative number,
so there’s no point in studying the behavior of logs near −∞!) Here’s the
graph of \(y = \ln(x)\) once again:

[Figure: graph of y = ln(x)]

Again, it’s important to note that the curve never touches the y-axis, even
though it looks as if it does. It just gets very, very close. In any event, it
seems as if

\[\lim_{x\to\infty}\ln(x) = \infty.\]

This is actually easy to show directly. Do you believe that \(\ln(x)\) ever makes
it up to 1000? Sure, it does: \(\ln(e^{1000}) = 1000\). The same trick works for any
number \(N\). Just take \(x = e^N\) and you will find that \(\ln(x) = \ln(e^N) = N\). So
there’s no limit to how big \(\ln(x)\) gets: it goes to ∞ as \(x \to \infty\) ... but how
fast?

It’s pretty easy to see that it must be quite slow. As we just noted,
\(\ln(e^{1000}) = 1000\). The number \(e^{1000}\) is positively humongous (much greater
than the number of atoms in the universe), yet its log is only 1000. Talk about
cutting things down to size!

More precisely, it turns out that \(\ln(x)\) goes to infinity much more slowly
than any positive power of \(x\), even something like \(x^{0.0001}\). So if you take
the ratio of \(\ln(x)\) to any positive power of \(x\), the ratio should be small (at
least, when \(x\) is very large). In symbols, we have


Logs grow slowly:

\[\text{if } a > 0, \quad \lim_{x\to\infty}\frac{\ln(x)}{x^a} = 0\]

no matter how small a is.


Just as in the case of exponentials, it’s not too hard to extend this to a more
general form:

\[\lim_{x\to\infty}\frac{\text{log of any positive poly-type stuff}}{\text{poly-type stuff of positive "degree"}} = 0.\]

This works for logs of any base \(b > 1\), not just the natural logarithm. (That’s
because of the change of base rule.) For example,

\[\lim_{x\to\infty}\frac{\log_7(x^3 + 3x - 1)}{x^{0.1} - 99} = 0\]

even though the power \(x^{0.1}\) is very small.

Actually, we shouldn’t be surprised that logs grow slowly, once we know
that exponentials grow quickly. After all, logs and exponentials are inverses
of each other. More precisely, if you take \(\ln(x)/x^a\) and replace \(x\) by \(e^t\), you
get

\[\lim_{x\to\infty}\frac{\ln(x)}{x^a} = \lim_{t\to\infty}\frac{\ln(e^t)}{(e^t)^a} = \lim_{t\to\infty}\frac{t}{e^{at}} = 0.\]

The last limit is 0 because the exponential \(e^{at}\) on the bottom grows much
more quickly than the polynomial \(t\) on the top. So we have shown that the
fact that exponentials grow quickly automatically leads to the fact that logs
grow slowly.
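How slowly is slowly? The Python sketch below tabulates ln(x)/x^a for the small power a = 0.1; the helper name `ratio` and the sample values of x are illustrative assumptions.

```python
import math

def ratio(x, a):
    """The quotient ln(x) / x**a."""
    return math.log(x) / x ** a

# The ratio is not small at first, but it does head to 0 once x is truly huge.
for x in (1e10, 1e50, 1e100, 1e200):
    print(x, ratio(x, 0.1))
```

The output shows that "when x is very large" can mean astronomically large if a is small: with a = 0.1 the ratio is still bigger than 1 at x = 10^10.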


9.4.6 Behavior of logs near 0 

It’s tempting to write \(\ln(0) = -\infty\), but it’s just not true: \(\ln(0)\) is not defined.
On the other hand, the graph of \(y = \ln(x)\) above suggests that

\[\lim_{x\to 0^+}\ln(x) = -\infty.\]

You need to use the right-hand limit here, since \(\ln(x)\) isn’t even defined for
\(x < 0\). Once again, though, we need to say more. Sure, \(\ln(x)\) goes to −∞ as
\(x \to 0^+\), but how quickly? For example, consider the limit

\[\lim_{x\to 0^+}x\ln(x).\]

If you just plug in 0, it doesn’t work at all, since \(\ln(0)\) doesn’t exist. When \(x\)
is a little bigger than 0, the quantity \(x\) is small and \(\ln(x)\) is a large negative
number. What happens when you multiply a small number by a large one? It
could be anything at all, depending on how small and how large the numbers
are.

Here’s one way to solve the above problem. Replace \(x\) by \(1/t\). Then as
\(x \to 0^+\), we can see that \(t \to \infty\). So we have

\[\lim_{x\to 0^+}x\ln(x) = \lim_{t\to\infty}\frac{1}{t}\ln\left(\frac{1}{t}\right).\]

Of course, \(\ln(1/t)\) is just \(\ln(1) - \ln(t)\), which equals \(-\ln(t)\), since \(\ln(1) = 0\).
So we get

\[\lim_{x\to 0^+}x\ln(x) = \lim_{t\to\infty}\frac{-\ln(t)}{t} = -\lim_{t\to\infty}\frac{\ln(t)}{t} = 0,\]

where the limit is 0 because logs grow slowly.
The trick of replacing \(x\) by \(1/t\) to transfer the behavior near 0 to behavior
near ∞ works because \(\ln(1/t) = -\ln(t)\). You can use it to show the following
principle, of which the above example is a special case:


Logs “grow” slowly at 0:

\[\text{if } a > 0, \quad \lim_{x\to 0^+}x^a\ln(x) = 0\]

no matter how small a is.


(I put “grow” in quotation marks because \(\ln(x)\) really grows downward to
−∞ as \(x \to 0^+\).) Once again, you can replace \(x^a\) by poly-type stuff, as long
as it becomes small when \(x \to 0^+\), and “ln” can be replaced by “\(\log_b\)” for any
other base \(b > 1\) (that is, not just the base \(e\)).


9.5 Logarithmic Differentiation

Logarithmic differentiation is a useful technique for dealing with derivatives
of things like \(f(x)^{g(x)}\), where both the base and the exponent are functions of
\(x\). After all, how on earth would you find

\[\frac{d}{dx}\left(x^{\sin(x)}\right)\]

with what we have seen already? It doesn’t fit any of the rules. Still, we have
these nice log rules which cut exponents down to size. If we let \(y = x^{\sin(x)}\),
then

\[\ln(y) = \ln\left(x^{\sin(x)}\right) = \sin(x)\ln(x)\]

by log rule #5 from Section 9.1.4 above. Now let’s differentiate both sides
(implicitly) with respect to \(x\):

\[\frac{d}{dx}(\ln(y)) = \frac{d}{dx}(\sin(x)\ln(x)).\]

Let’s look at the right-hand side first. This is just a function of \(x\) and requires
the product rule; you should check that the derivative works out to be
\(\cos(x)\ln(x) + \sin(x)/x\). Now let’s look at the left-hand side. To differentiate
\(\ln(y)\) with respect to \(x\) (not \(y\)!), we should use the chain rule. Set \(u = \ln(y)\),
so that \(du/dy = 1/y\). We need to find \(du/dx\); by the chain rule,

\[\frac{du}{dx} = \frac{du}{dy}\frac{dy}{dx} = \frac{1}{y}\frac{dy}{dx}.\]

So, implicitly differentiating the equation \(\ln(y) = \sin(x)\ln(x)\) produces

\[\frac{1}{y}\frac{dy}{dx} = \cos(x)\ln(x) + \frac{\sin(x)}{x}.\]

Now we just have to multiply both sides by \(y\) and then replace \(y\) by \(x^{\sin(x)}\):

\[\frac{d}{dx}\left(x^{\sin(x)}\right) = \left(\cos(x)\ln(x) + \frac{\sin(x)}{x}\right)x^{\sin(x)}.\]

That’s the answer we’re looking for. (By the way, there is another way we
could have done this problem. Instead of using the variable \(y\), we could just
have used our formula \(A = e^{\ln(A)}\) to write

\[x^{\sin(x)} = e^{\ln\left(x^{\sin(x)}\right)} = e^{\sin(x)\ln(x)}.\]

Now I leave it to you to differentiate the right-hand side of this with respect
to \(x\) by using the product and chain rules. When you’ve finished, you should
replace \(e^{\sin(x)\ln(x)}\) by \(x^{\sin(x)}\) and check that you get the same answer as the
original one above.)

Let’s review the main technique. Suppose you want to find the derivative 
with respect to x of 

$$y = f(x)^{g(x)},$$

where both the base f and the exponent g involve the variable x. Here’s what
you do:

1. Let y be the function of x you want to differentiate. Take (natural) logs
of both sides. The exponent g comes down on the right-hand side, so
you should get

$$\ln(y) = g(x)\ln(f(x)).$$

2. Differentiate both sides implicitly with respect to x. The right-hand
side often requires the product rule and the chain rule (at least). The
left-hand side always works out to be (1/y)(dy/dx). So you get

$$\frac{1}{y}\,\frac{dy}{dx} = \frac{d}{dx}\bigl(g(x)\ln(f(x))\bigr),$$

where the right-hand side works out to be some expression in x.

3. Multiply both sides by y to isolate dy/dx, then replace y by the original
expression $f(x)^{g(x)}$, and you’re done.
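If you have a computer handy, here is one way to sanity-check the three steps numerically. This is only a sketch in Python, not part of the method itself: the helper names `log_diff_derivative` and `central_difference` are made up for illustration, and the code assumes f(x) > 0 so that ln(f(x)) makes sense. It compares the logarithmic-differentiation answer for $x^{\sin(x)}$ with a crude numerical slope.

```python
import math

def log_diff_derivative(f, df, g, dg, x):
    """Derivative of y = f(x)**g(x) by logarithmic differentiation.

    Step 1: ln(y) = g ln(f).
    Step 2: (1/y) y' = g' ln(f) + g f'/f   (product rule and chain rule).
    Step 3: multiply back by y.
    Assumes f(x) > 0 so that ln(f(x)) is defined.
    """
    y = f(x) ** g(x)
    return y * (dg(x) * math.log(f(x)) + g(x) * df(x) / f(x))

def central_difference(func, x, h=1e-6):
    """Numerical slope, used only as an independent cross-check."""
    return (func(x + h) - func(x - h)) / (2 * h)

# Example from the text: y = x**sin(x), so f(x) = x and g(x) = sin(x).
x0 = 2.0
by_steps = log_diff_derivative(lambda x: x, lambda x: 1.0, math.sin, math.cos, x0)
closed_form = (math.cos(x0) * math.log(x0) + math.sin(x0) / x0) * x0 ** math.sin(x0)
numeric = central_difference(lambda x: x ** math.sin(x), x0)
print(by_steps, closed_form, numeric)
```

All three printed numbers should agree to several decimal places; the point x0 = 2 is an arbitrary choice.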


Here’s another example: what is

$$\frac{d}{dx}\left((1+x^2)^{1/x^3}\right)?$$

According to the first step, we let $y = (1+x^2)^{1/x^3}$, then take logs of both
sides, bringing the exponent down; we get

$$\ln(y) = \ln\left((1+x^2)^{1/x^3}\right) = \frac{\ln(1+x^2)}{x^3}.$$

The second step is to differentiate both sides implicitly with respect to x.
The left-hand side, as always, becomes (1/y)(dy/dx), but we’ll have to use
the quotient rule on the right-hand side. First, differentiate $z = \ln(1+x^2)$
using the chain rule: if $u = 1 + x^2$, then z = ln(u), so

$$\frac{dz}{dx} = \frac{dz}{du}\,\frac{du}{dx} = \frac{1}{u}\,(2x) = \frac{2x}{1+x^2}.$$

Now you can use the quotient rule; you should check that when you implicitly
differentiate the equation $\ln(y) = \ln(1+x^2)/x^3$ from above, you get (after
simplifying)

$$\frac{1}{y}\,\frac{dy}{dx} = \frac{x^3\cdot\dfrac{2x}{1+x^2} - 3x^2\ln(1+x^2)}{(x^3)^2} = \frac{2x^2 - 3(1+x^2)\ln(1+x^2)}{x^4(1+x^2)}.$$

Finally, multiply through by y and replace y by $(1+x^2)^{1/x^3}$ to get

$$\begin{aligned}
\frac{dy}{dx} &= \frac{\bigl(2x^2 - 3(1+x^2)\ln(1+x^2)\bigr)\,y}{x^4(1+x^2)}\\
&= \frac{\bigl(2x^2 - 3(1+x^2)\ln(1+x^2)\bigr)(1+x^2)^{1/x^3}}{x^4(1+x^2)}\\
&= \frac{2x^2 - 3(1+x^2)\ln(1+x^2)}{x^4(1+x^2)^{1-1/x^3}},
\end{aligned}$$

and we’re all done. 

Even if the base and exponent are not both functions of x, logarithmic 
differentiation can still come in handy. If your function is really nasty and 
involves lots of products and quotients of powers (like $x^2$) and exponentials
(like $e^x$), you might want to try logarithmic differentiation. For example,

$$\text{if}\quad y = \frac{(x^2-3)^{100}\,3^{\sec(x)}}{2x^5\bigl(\log_7(x)+\cot(x)\bigr)^9},\quad\text{what is}\quad \frac{dy}{dx}?$$

I must be joking, right? How can you be expected to differentiate something
so foul? By logarithmic differentiation, that’s how. Just take natural logs
of both sides, and you’ll find that the right-hand side becomes much more
manageable (provided that you remember your log rules), like this:

$$\begin{aligned}
\ln(y) &= \ln\left(\frac{(x^2-3)^{100}\,3^{\sec(x)}}{2x^5\bigl(\log_7(x)+\cot(x)\bigr)^9}\right)\\
&= \ln\bigl((x^2-3)^{100}\bigr) + \ln\bigl(3^{\sec(x)}\bigr) - \ln(2) - \ln(x^5) - \ln\bigl((\log_7(x)+\cot(x))^9\bigr)\\
&= 100\ln(x^2-3) + \sec(x)\ln(3) - \ln(2) - 5\ln(x) - 9\ln\bigl(\log_7(x)+\cot(x)\bigr).
\end{aligned}$$

Make sure you understand these log manipulations before reading on. Anyway,
now we can differentiate this expression implicitly with respect to x
without too much drama:

$$\frac{d}{dx}\bigl(\ln(y)\bigr) = \frac{d}{dx}\Bigl(100\ln(x^2-3) + \sec(x)\ln(3) - \ln(2) - 5\ln(x) - 9\ln\bigl(\log_7(x)+\cot(x)\bigr)\Bigr).$$

The left-hand side is (l/y)(dy/dx) as usual, so let’s take a look at the right- 
hand side, term by term. 

• The first term is $100\ln(x^2-3)$; it’s a straightforward chain rule exercise
to see that the derivative is $100 \times (2x)/(x^2-3)$, which is of course
$200x/(x^2-3)$.

• The second term is sec(x) ln(3). Before you whip out the product rule,
remember that ln(3) is a constant, so in fact you can just take the
derivative of sec(x) and then multiply by ln(3) to get ln(3) sec(x) tan(x).

• The third term is −ln(2), which is a constant, so its derivative is just 0.

• The fourth term is −5 ln(x), which has derivative −5/x.

• The fifth term, $-9\ln(\log_7(x)+\cot(x))$, which I’ll call z, requires the
chain rule. Here are the details, although you should be able to work
this out for yourself. Let $u = \log_7(x) + \cot(x)$, so z = −9 ln(u). Then
we have

$$\frac{dz}{dx} = \frac{dz}{du}\,\frac{du}{dx} = -\frac{9}{u}\left(\frac{1}{x\ln(7)} - \csc^2(x)\right) = \frac{9}{\log_7(x)+\cot(x)}\left(\csc^2(x) - \frac{1}{x\ln(7)}\right).$$

Let’s put it all together to get

$$\frac{1}{y}\,\frac{dy}{dx} = \frac{200x}{x^2-3} + \ln(3)\sec(x)\tan(x) - \frac{5}{x} + \frac{9}{\log_7(x)+\cot(x)}\left(\csc^2(x) - \frac{1}{x\ln(7)}\right).$$

Now multiply by y to get

$$\frac{dy}{dx} = \left(\frac{200x}{x^2-3} + \ln(3)\sec(x)\tan(x) - \frac{5}{x} + \frac{9}{\log_7(x)+\cot(x)}\left(\csc^2(x) - \frac{1}{x\ln(7)}\right)\right)\times y.$$

Finally, replace y by the original (horrible) expression to get

$$\frac{dy}{dx} = \left(\frac{200x}{x^2-3} + \ln(3)\sec(x)\tan(x) - \frac{5}{x} + \frac{9}{\log_7(x)+\cot(x)}\left(\csc^2(x) - \frac{1}{x\ln(7)}\right)\right)\times\frac{(x^2-3)^{100}\,3^{\sec(x)}}{2x^5\bigl(\log_7(x)+\cot(x)\bigr)^9}.$$

It seems nasty, but just imagine trying to do it without logarithmic
differentiation!
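With this many terms it’s easy to drop a sign, so here is a rough numerical check, sketched in Python. It assumes the function is the quotient $(x^2-3)^{100}\,3^{\sec(x)}/(2x^5(\log_7(x)+\cot(x))^9)$ from this example and tests it at the arbitrarily chosen point x = 4, where every quantity being logged is positive. The names `y` and `log_derivative` are just labels for this sketch.

```python
import math

def y(x):
    """The 'horrible' function from the example (evaluated where it is positive)."""
    log7 = math.log(x, 7)
    cot = math.cos(x) / math.sin(x)
    sec = 1 / math.cos(x)
    return (x**2 - 3) ** 100 * 3 ** sec / (2 * x**5 * (log7 + cot) ** 9)

def log_derivative(x):
    """(1/y) dy/dx, assembled term by term as in the bullet points."""
    log7 = math.log(x, 7)
    cot = math.cos(x) / math.sin(x)
    sec = 1 / math.cos(x)
    csc2 = 1 / math.sin(x) ** 2
    return (200 * x / (x**2 - 3)               # first term
            + math.log(3) * sec * math.tan(x)  # second term
            - 5 / x                            # fourth term (the third is 0)
            + 9 / (log7 + cot) * (csc2 - 1 / (x * math.log(7))))  # fifth term

x0, h = 4.0, 1e-6
# Numerical slope of ln(y), which should match (1/y) dy/dx.
numeric = (math.log(y(x0 + h)) - math.log(y(x0 - h))) / (2 * h)
print(log_derivative(x0), numeric)
```

Comparing the slope of ln(y) rather than of y itself keeps the numbers at a reasonable size, since y is astronomically large at x = 4.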


9.5.1 The derivative of $x^a$

Now we can finally show something that we’ve been taking for granted:

$$\frac{d}{dx}(x^a) = ax^{a-1}$$

for any number a, not just integers as we’ve seen before. Let’s suppose x > 0.
Now use logarithmic differentiation: set $y = x^a$, so that ln(y) = a ln(x). If
you differentiate both sides implicitly, you get

$$\frac{1}{y}\,\frac{dy}{dx} = \frac{a}{x}.$$

Now multiply both sides by y and replace y by $x^a$:

$$\frac{dy}{dx} = \frac{ay}{x} = \frac{ax^a}{x} = ax^{a-1}.$$

This is exactly what we want, at least when x > 0. When x < 0, we have a
bit of a problem. For example, you can’t even take $(-1)^{1/2}$ because this is the
square root of a negative number. So what on earth should $(-1)^{\sqrt{2}}$ be? In
fact, without using complex numbers (after all, we won’t look at these until
Chapter 28), you can only make sense of $x^a$ for x < 0 when a is a rational
number with an odd denominator (after canceling out common factors). For
example, $x^{5/3}$ makes sense for negative x since you can always take a cube
root; we’re OK because 3 is odd. In the case where $x^a$ makes sense for x < 0,
it turns out that it’s either an even or an odd function of x; you can use that
fact to show that the derivative is still $ax^{a-1}$.

Here are a couple of simple examples of using the formula. Working on the
domain (0, ∞), what is the derivative of $x^{\sqrt{2}}$ with respect to x? How about
$x^{\pi}$? Just use the formula to show that

$$\frac{d}{dx}\left(x^{\sqrt{2}}\right) = \sqrt{2}\,x^{\sqrt{2}-1} \quad\text{and}\quad \frac{d}{dx}\left(x^{\pi}\right) = \pi x^{\pi-1}$$

for x > 0. It’s not really any different from what we’ve done before; it’s just that
we can handle non-integer exponents now.
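If you’d like some reassurance that the formula really does hold for strange exponents, here is a small Python sketch that compares $ax^{a-1}$ with a numerical slope. The function names are invented for this illustration, and the test point x = 1.7 is arbitrary (any x > 0 would do).

```python
import math

def power_rule(a, x):
    """The formula d/dx x**a = a * x**(a - 1), used here only for x > 0."""
    return a * x ** (a - 1)

def central_difference(func, x, h=1e-6):
    """Numerical slope for comparison."""
    return (func(x + h) - func(x - h)) / (2 * h)

x0 = 1.7
for a in (math.sqrt(2), math.pi, 5 / 3):
    numeric = central_difference(lambda x: x ** a, x0)
    print(a, power_rule(a, x0), numeric)
```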


9.6 Exponential Growth and Decay


We’ve seen that bank accounts with continuous compounding grow exponentially.
We don’t need to look to such human-made devices to find exponential
growth, though: it occurs in nature too. For example, under certain
circumstances, populations of animals, like rabbits (and humans!), grow
exponentially. There’s also exponential decay, where a quantity gets smaller
and smaller in an exponential fashion (we’ll see what this means very soon).
This occurs in radioactive decay, allowing scientists to find out how old some
ancient artifacts, fossils, or rocks are.

Here’s the basic idea. Suppose $y = e^{kx}$. Then, as we saw at the beginning
of Section 9.3.1 above, $dy/dx = ke^{kx}$. The right-hand side of this equation
can be written as ky, since $y = e^{kx}$. That is,

$$\frac{dy}{dx} = ky.$$

This is an example of a differential equation. After all, it’s an equation
involving derivatives. We’ll look at many more differential equations in Chapter 30,
but let’s just focus on this one for the moment. What other functions satisfy
the above equation? We know that $y = e^{kx}$ does, but there must be others.
For example, if $y = 2e^{kx}$, then $dy/dx = 2ke^{kx}$, which is once again equal to
ky. More generally, if $y = Ae^{kx}$ for any constant A, then $dy/dx = Ake^{kx}$, which is once again
equal to ky. It turns out that this is the only way you can have dy/dx = ky:

$$\text{if}\quad \frac{dy}{dx} = ky,\quad\text{then}\quad y = Ae^{kx}\ \text{for some constant } A.$$

We’ll see why in Section 30.2 of Chapter 30. In the meantime, let’s take a
closer look at the differential equation dy/dx = ky. The first thing we’ll do is
change the variable x to t so that we are looking at

$$\frac{dy}{dt} = ky.$$










This means that the rate of change of y is equal to ky. Interesting! The rate
that the quantity is changing depends on how much of the quantity you have.
If you have more of the quantity, then it grows faster (assuming k > 0). This
makes sense in the case of population growth: the more rabbits you have, the
more they can breed. If you have twice as many rabbits, they also produce
twice as many rabbits in any given time period. The number k, which is
called the growth constant, controls how fast the rabbits are breeding in the
first place. The hornier they are, the higher k is!

9.6.1 Exponential growth

So, suppose we have a population which grows exponentially. In symbols, let
P (or P(t), if you prefer) be the population at time t, and let k be the growth
constant. The differential equation for P is

$$\frac{dP}{dt} = kP.$$

This is the same as the differential equation in the box above, except that
some symbols have changed. Instead of y, we have P; and instead of x, we
have t. Never mind, we’re good at adapting to these situations; we’ll just
make the same changes in the solution $y = Ae^{kx}$. We end up with $P = Ae^{kt}$
for some constant A. Now, when t = 0, we have $P = Ae^{k(0)} = Ae^0 = A$, since
$e^0 = 1$. This means that A is the initial population, that is, the population
at time 0. It’s customary to relabel this variable as well. Instead of A, we
write $P_0$ to indicate that it represents the population at time 0. Altogether,
we have found the

$$\text{exponential growth equation:}\quad P(t) = P_0e^{kt}.$$

Remember, $P_0$ is the initial population and k is the growth constant.

This formula is easy to apply in practice, provided that you know your
exponential and log rules (see Sections 9.1.1 and 9.1.4 above). For example,
if you know that a population of rabbits started 3 years ago at 1000, but now
has grown to 64,000, then what will the population be one year from now?
Also, what is the total time it will take for the population to grow from 1000
to 400,000?

Well, we have $P_0 = 1000$, since that’s the initial population. So the
equation in the box above becomes $P(t) = 1000e^{kt}$. The problem is, we don’t
know what k is. We do know that P = 64000 when t = 3, so let’s plug that
in:

$$64000 = 1000e^{3k}.$$

This means that $e^{3k} = 64$. Take logs of both sides to get 3k = ln(64), so
k = (1/3) ln(64). Actually, if you write $\ln(64) = \ln(2^6) = 6\ln(2)$, then you can
simplify down to k = 2 ln(2). This means that

$$P(t) = 1000e^{2\ln(2)t}$$

for any time t. Now we can solve both parts of the problem. For the first
part, we want to know what happens a year from now. This is 4
years from the initial time, so set t = 4. We get

$$P(4) = 1000e^{2\ln(2)\times 4} = 1000e^{8\ln(2)}.$$

Now we get a little tricky: write 8 ln(2) as $\ln(2^8) = \ln(256)$, so

$$P(4) = 1000e^{\ln(256)} = 1000 \times 256 = 256000.$$

Here we have used the crucial formula $e^{\ln(A)} = A$ for any number A > 0. The
conclusion is that the population will be 256,000 a year from now. Now let’s
tackle the second part of the problem. We want to see how long it will take
for the population to get up to 400,000, so set P = 400000 to get

$$400000 = 1000e^{2\ln(2)t}.$$

This becomes $e^{2\ln(2)t} = 400$. To solve this, take logs of both sides; we get
2 ln(2) t = ln(400), which means that

$$t = \frac{\ln(400)}{2\ln(2)}.$$

This is the number of years it takes for the population to grow from 1000
to 400,000, but it’s not very intuitive. You could use a calculator to get an
approximation; but suppose you don’t have one handy. You just have to know
that ln(5) is approximately 1.6 and ln(2) is approximately 0.7. Start off by
writing $400 = 20^2$, so $\ln(400) = \ln(20^2) = 2\ln(20)$. We can do even better,
though: ln(20) = ln(4 × 5) = ln(4) + ln(5) = 2 ln(2) + ln(5). All told, we get

$$t = \frac{\ln(400)}{2\ln(2)} = \frac{2\bigl(2\ln(2) + \ln(5)\bigr)}{2\ln(2)} = 2 + \frac{\ln(5)}{\ln(2)}.$$

Using our approximations, we get

$$t \approx 2 + \frac{1.6}{0.7} = 2 + \frac{16}{7} = 4\tfrac{2}{7}.$$

So although it takes 4 years to get up to a population of 256,000, it only takes
approximately two-sevenths of a year more (about 3½ months) to get up to
400,000. That’s the power of exponential growth.
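If you do have a calculator (or a computer) handy, the whole rabbit calculation takes only a few lines. The following Python sketch uses the numbers from this example; the variable names are my own. It solves for k from the data point P(3) = 64000 and then answers both questions.

```python
import math

P0 = 1000.0                        # initial population, 3 years ago
k = math.log(64000 / P0) / 3       # from 64000 = 1000 e^(3k), so k = ln(64)/3

def population(t):
    """Exponential growth equation P(t) = P0 e^(kt)."""
    return P0 * math.exp(k * t)

one_year_from_now = population(4)            # "now" is t = 3, so next year is t = 4
years_to_400k = math.log(400000 / P0) / k    # solve 400000 = P0 e^(kt) for t
print(round(one_year_from_now), years_to_400k)
```

The second number comes out to roughly 4.32 years, a touch more than the hand estimate of 4 2/7 (about 4.29), because 1.6 and 0.7 are only approximations to ln(5) and ln(2).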

9.6.2 Exponential decay

Let’s turn things upside-down and look at exponential decay. To set the
scene, let me tell you that there are certain atoms which are radioactive.
They are like little time bombs: after a while they break apart into different
atoms, emitting energy at the same time. The only problem is that you never
know when they are going to break apart (we’ll say “decay” instead of “break
apart”). All you know is that over a given time, there’s a certain chance that
the decay will happen.

For example, you might have a certain type of atom which has a 50%
chance of decaying within any 7-year period. So if you have one of these
atoms in a box, close the box, and then open it up in 7 years, there’s a 50-50
chance that it will have decayed. Of course, it’s pretty difficult to see an
individual atom! So let’s suppose, a little more realistically, that you have a
trillion atoms. (That’s still a tiny speck of material, by the way.) You put 
them in the box and come back 7 years later. What do you expect to find? 
Well, about half the atoms should have decayed, while the other half remain 
intact. So you should have about half a trillion of the original atoms. What 
if you come back in another 7 years? Then half the remaining original atoms 
will be left, leaving you with a quarter of a trillion of the original atoms. 
Every 7 years, you lose half of your remaining sample. 

So let’s try to write down an equation to model the situation. If P(t) is
the number (population?) of atoms at time t, then I claim that

$$\frac{dP}{dt} = -kP$$

for some positive constant k. This says that the rate of change of P is a negative
multiple of P. That is, P decays at a rate proportional to P. The more
atoms you have, the faster the decay. This agrees with our above example: in
the first 7 years, we lost half a trillion atoms, but in the next 7 years, we only
lost a quarter of a trillion. In another 7 years, we’ll only lose one-eighth of a
trillion atoms. The more we have, the more we lose. Anyway, the solution to
the above differential equation is

$$P(t) = P_0e^{-kt},$$

where $P_0$ is the original number of atoms (at time t = 0). This is exactly
the same as the equation for exponential growth from the previous section,
except that we have replaced the growth constant k by a negative constant
−k, which is called the decay constant.

In the above example, we know that it takes 7 years for any sample of
atoms to halve in size. This length of time is called the half-life of the atom (or
material). In the above equation, this means that if you start with $P_0$ atoms,
then in 7 years, you’ll have $\tfrac{1}{2}P_0$ atoms. So, setting t = 7 and $P(7) = \tfrac{1}{2}P_0$ in
the above equation, we have

$$\tfrac{1}{2}P_0 = P_0e^{-7k}.$$

Now cancel out the factor of $P_0$ from both sides and take the log of both sides
to get

$$\ln\left(\tfrac{1}{2}\right) = -7k.$$

Since ln(1/2) = ln(1) − ln(2) = −ln(2), the above equation becomes

$$k = \frac{\ln(2)}{7}.$$

This means that

$$P(t) = P_0e^{-t(\ln(2)/7)}$$

in this case.

Now let’s generalize a little. Suppose you have some other radioactive
material with a half-life of $t_{1/2}$ years. This means that half of any size sample
of the material will decay in $t_{1/2}$ years. It doesn’t mean that the whole sample
will decay in twice that many years! Anyway, by the same reasoning as in the
previous paragraph, we can show that $k = \ln(2)/t_{1/2}$. In summary,

$$\text{for radioactive decay with half-life } t_{1/2},\quad P(t) = P_0e^{-kt}\quad\text{with}\quad k = \frac{\ln(2)}{t_{1/2}}.$$


For example, if the half-life of the material is still 7 years, and you start off
with 50 pounds of the material, how much do you have after 10 years, and
how long does it take before you are down to 1 pound of the material? We
know $t_{1/2} = 7$, so k = ln(2)/7, as we saw before. Since $P_0 = 50$ (in pounds),
the decay equation $P(t) = P_0e^{-kt}$ becomes

$$P(t) = 50e^{-t(\ln(2)/7)}.$$

So when t = 10, we have

$$P(10) = 50e^{-10\ln(2)/7}.$$

That is, we are down to $50e^{-10\ln(2)/7}$ pounds. If we use our approximation
ln(2) ≈ 0.7 from above, then we see that we have approximately $50e^{-1}$ pounds,
which we can further approximate to about 18.4 pounds.

As for the second part of the question, now we need to find out how long
it takes before we are down to one pound of material, so set P(t) = 1 in the
above equation for P(t) to get

$$1 = 50e^{-t(\ln(2)/7)}.$$

Divide both sides by 50 and take logs to get

$$\ln\left(\frac{1}{50}\right) = -\frac{t\ln(2)}{7}.$$

Since ln(1/50) = −ln(50), we have −7 ln(50) = −t ln(2); that is,

$$t = \frac{7\ln(50)}{\ln(2)}.$$

We can estimate this using our previous approximations ln(5) ≈ 1.6 and
ln(2) ≈ 0.7. Write ln(50) = ln(2 × 5 × 5) = ln(2) + 2 ln(5) to see that

$$\frac{7\ln(50)}{\ln(2)} = \frac{7\bigl(\ln(2) + 2\ln(5)\bigr)}{\ln(2)} = 7 + \frac{14\ln(5)}{\ln(2)} \approx 7 + \frac{14(1.6)}{0.7},$$

which works out to be 39 years. So it takes approximately 39 years for the
sample to decay from 50 pounds down to a single pound. By the way, 39 years
is a little more than 5½ half-lives (since one half-life is 7 years). So if you have
50 pounds of a different radioactive material with a half-life of 10 years, then
this material will take a little more than 55 years to decay to 1 pound. (The
actual number is 10 ln(50)/ln(2) years, which is closer to 56½ years.)
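Here is the same decay calculation sketched in Python, in case you want the numbers without the rough approximations for ln(2) and ln(5). The function names (`decay_constant`, `remaining`, `time_to_reach`) are invented for this illustration; they simply encode $k = \ln(2)/t_{1/2}$ and $P(t) = P_0e^{-kt}$.

```python
import math

def decay_constant(half_life):
    """k = ln(2) / half-life."""
    return math.log(2) / half_life

def remaining(p0, half_life, t):
    """Amount left after time t: P(t) = P0 e^(-kt)."""
    return p0 * math.exp(-decay_constant(half_life) * t)

def time_to_reach(p0, target, half_life):
    """Solve target = P0 e^(-kt) for t."""
    return math.log(p0 / target) / decay_constant(half_life)

after_10_years = remaining(50, 7, 10)           # pounds left after 10 years
years_to_one_pound = time_to_reach(50, 1, 7)    # half-life of 7 years
years_with_half_life_10 = time_to_reach(50, 1, 10)
print(after_10_years, years_to_one_pound, years_with_half_life_10)
```

The more precise values are about 18.6 pounds and about 39.5 years, so the hand estimates of 18.4 pounds and 39 years are in the right ballpark.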
9.7 Hyperbolic Functions


Let’s change course and look at the so-called hyperbolic functions. These
are actually exponential functions in disguise, but they are similar to trig
functions in many ways. We won’t be using them much but they do come up
occasionally, so it’s good to be familiar with them.

We’ll start by defining the hyperbolic cosine and hyperbolic sine functions:

$$\cosh(x) = \frac{e^x + e^{-x}}{2} \quad\text{and}\quad \sinh(x) = \frac{e^x - e^{-x}}{2}.$$


No triangles needed! This isn’t trigonometry, after all.* These functions
behave somewhat like their ordinary cousins, but not exactly. For example,
if you square cosh(x) and sinh(x), you find that

$$\cosh^2(x) = \frac{e^{2x} + e^{-2x} + 2}{4} \quad\text{and}\quad \sinh^2(x) = \frac{e^{2x} + e^{-2x} - 2}{4}.$$

(We used the fact that $e^xe^{-x} = 1$.) Anyway, let’s take the difference of these
two quantities:

$$\cosh^2(x) - \sinh^2(x) = \frac{e^{2x} + e^{-2x} + 2}{4} - \frac{e^{2x} + e^{-2x} - 2}{4} = \frac{4}{4} = 1.$$

So we’ve proved that

$$\cosh^2(x) - \sinh^2(x) = 1$$

for any x. Not quite the same as the regular old trig identity: the minus
makes all the difference. (Indeed, $x^2 - y^2 = 1$ is the equation of a hyperbola.)

How about calculus properties? Well, let’s differentiate y = sinh(x); we’ll
need the fact that the derivative of $e^{-x}$ is $-e^{-x}$:

$$\frac{d}{dx}\sinh(x) = \frac{d}{dx}\left(\frac{e^x - e^{-x}}{2}\right) = \frac{e^x + e^{-x}}{2} = \cosh(x).$$

So the derivative of hyperbolic sine is hyperbolic cosine. That’s just like what
happens with regular old sine and cosine. On the other hand,

$$\frac{d}{dx}\cosh(x) = \frac{d}{dx}\left(\frac{e^x + e^{-x}}{2}\right) = \frac{e^x - e^{-x}}{2} = \sinh(x).$$

If these were ordinary trig functions, then the derivative would be negative
hyperbolic sine, but we don’t have a negative here. In any case, we have
shown that

$$\frac{d}{dx}\sinh(x) = \cosh(x) \quad\text{and}\quad \frac{d}{dx}\cosh(x) = \sinh(x).$$


* There is actually a branch of geometry called hyperbolic geometry, in which the
triangles have wacky properties that lead to hyperbolic functions.
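None of this needs a computer, but if you like to see identities confirmed with actual numbers, here is a short Python sketch. It builds cosh and sinh straight from the exponential definitions (rather than using the library versions), then checks the identity and both derivative formulas at a few arbitrarily chosen points.

```python
import math

def cosh(x):
    """Hyperbolic cosine from its definition."""
    return (math.exp(x) + math.exp(-x)) / 2

def sinh(x):
    """Hyperbolic sine from its definition."""
    return (math.exp(x) - math.exp(-x)) / 2

def central_difference(func, x, h=1e-6):
    """Numerical slope for comparison."""
    return (func(x + h) - func(x - h)) / (2 * h)

for x in (-2.0, 0.0, 0.5, 3.0):
    identity = cosh(x) ** 2 - sinh(x) ** 2     # should be 1
    d_sinh = central_difference(sinh, x)       # should match cosh(x)
    d_cosh = central_difference(cosh, x)       # should match sinh(x), with no minus sign
    print(x, identity, d_sinh - cosh(x), d_cosh - sinh(x))
```

In each printed row, the second entry should be very close to 1 and the last two very close to 0.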
Now let’s look at the graphs of these functions. First, you should try to
convince yourself that y = cosh(x) is an even function of x and that y = sinh(x) is
an odd function of x. (Just plug in −x and see what happens.) Furthermore,
cosh(0) = 1 and sinh(0) = 0 (check this too). Finally, let’s note that

$$\lim_{x\to\infty}\cosh(x) = \lim_{x\to\infty}\frac{e^x + e^{-x}}{2}.$$

The term $e^x$ goes to ∞, but $e^{-x}$ goes to 0. The overall effect is that the limit
is ∞. The same thing works for sinh(x), so our graphs must look something
like this:

[Figure: graphs of y = cosh(x) and y = sinh(x)]

Of course you can define tanh(x) as sinh(x)/cosh(x), as well as the reciprocals
sech(x), csch(x), and coth(x). Each of the functions sech, csch, and coth can
be differentiated by replacing them with the appropriate exponentials; for
example,

$$\operatorname{sech}(x) = \frac{1}{\cosh(x)} = \frac{2}{e^x + e^{-x}},$$

which you can then differentiate using the chain rule or the quotient rule.
There are also identities connecting the functions, the most important of
which is

$$1 - \tanh^2(x) = \operatorname{sech}^2(x).$$

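This follows directly from the identity $\cosh^2(x) - \sinh^2(x) = 1$ by dividing
both sides by $\cosh^2(x)$. Now I’m just going to list the derivatives of the other
hyperbolic functions and display their graphs; I leave it to you to check that
the derivatives all work out and that the graphs at least make sense. First,
the derivatives:

$$\frac{d}{dx}\tanh(x) = \operatorname{sech}^2(x), \qquad \frac{d}{dx}\operatorname{sech}(x) = -\operatorname{sech}(x)\tanh(x),$$

$$\frac{d}{dx}\operatorname{csch}(x) = -\operatorname{csch}(x)\coth(x), \qquad \frac{d}{dx}\coth(x) = -\operatorname{csch}^2(x).$$

[Figures: graphs of y = tanh(x), y = sech(x), y = csch(x), and y = coth(x)]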















CHAPTER 10

Inverse Functions and Inverse Trig Functions


In the previous chapter, we looked at exponentials and logarithms. We got a
lot of mileage out of the fact that $e^x$ and ln(x) are inverses of each other. In
this chapter, we’ll look at some more general properties of inverse functions, 
then examine inverse trig functions (and their hyperbolic cousins) in greater 
detail. Here’s the game plan: 

• using the derivative to show that a function has an inverse; 

• finding the derivative of inverse functions; 

• inverse trig functions, one by one; and 

• inverse hyperbolic functions. 

10.1 The Derivative and Inverse Functions

In Section 1.2 of Chapter 1, we reviewed the basics of inverse functions. I 
strongly suggest you take a quick look over that section before reading further, 
familiarizing yourself with the general idea. Now that we know some calculus, 
we can say more. In particular, we’re going to explore two connections between 
derivatives and inverse functions. 

10.1.1 Using the derivative to show that an inverse exists

Suppose that you have a differentiable function f whose derivative is always
positive. What do you think the graph of this function looks like? Well, the
slope of the tangent has to be positive everywhere, so the function can’t dip
up and down: it has to go upward as we look from left to right. In other
words, the function must be increasing.

We’ll prove this fact in the next chapter (see Section 11.3.1 and also
Section 11.2), but it at least seems clear that it should be true. In any case, if
our function f is always increasing, then it must satisfy the horizontal line
test. No horizontal line could possibly hit the graph of y = f(x) twice. Since
the horizontal line test is satisfied by f, we know that f has an inverse. This
has given us a nice strategy for showing that a function has an inverse: show
that its derivative is always positive on its domain.

For example, suppose that

$$f(x) = \tfrac{1}{3}x^3 - x^2 + 5x - 11$$

on the domain R (the whole real line). Does f have an inverse? It would be a
real mess to switch x and y in the equation $y = \tfrac{1}{3}x^3 - x^2 + 5x - 11$ and then
try to solve for y. (Try it and see!) A much better way to show that f has an
inverse is to find the derivative. We get

$$f'(x) = x^2 - 2x + 5.$$

So what? Well, f' is just a quadratic. Its discriminant is −16, which is
negative, so the equation f'(x) = 0 has no solutions. (See Section 1.6 in
Chapter 1 for a review of the discriminant.) That means that f'(x) must
be always positive or always negative: its graph can’t cross the x-axis. Well, which
is it: positive or negative? Since f'(0) = 5, it must be positive;* that is,
f'(x) > 0 for all x. This means that f is increasing. In particular, f satisfies
the horizontal line test, so it has an inverse.
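If you want to see this argument in action, here’s a quick Python sketch. It computes the discriminant of f' and samples f on a grid to confirm that the values keep going up. Checking a grid isn’t a proof, of course (the discriminant argument is the proof); it just illustrates what “increasing” looks like numerically. The grid from −10 to 10 is an arbitrary choice.

```python
def f(x):
    return x**3 / 3 - x**2 + 5 * x - 11

def f_prime(x):
    return x**2 - 2 * x + 5

# Discriminant b^2 - 4ac of the quadratic f'(x) = x^2 - 2x + 5.
discriminant = (-2) ** 2 - 4 * 1 * 5

# Sample f at x = -10.0, -9.9, ..., 10.0 and confirm the values strictly increase.
xs = [i / 10 for i in range(-100, 101)]
values = [f(x) for x in xs]
strictly_increasing = all(a < b for a, b in zip(values, values[1:]))
smallest_slope = min(f_prime(x) for x in xs)
print(discriminant, smallest_slope, strictly_increasing)
```

The smallest sampled slope occurs at x = 1, where f'(1) = 4, which matches the completed-square form $(x-1)^2 + 4$.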

We’ve seen that if f'(x) > 0 for all x in the domain, then f has an
inverse. There are some variations. For example, if f'(x) < 0 for all x, then the
graph y = f(x) is decreasing. The horizontal line test still works, though:
the graph is just going down and down, so it can’t come back up and hit
the same horizontal line twice. Another variation is that the derivative might
be 0 for an instant but positive everywhere else. This is OK as long as the
derivative doesn’t stay at 0 for a long time. Here’s a summary of the situation:

Derivatives and inverse functions: if f is differentiable on its domain
(a, b) and any of the following are true:

1. f'(x) > 0 for all x in (a, b);

2. f'(x) < 0 for all x in (a, b);

3. f'(x) ≥ 0 for all x in (a, b) and f'(x) = 0 for only a finite number of x;
or

4. f'(x) ≤ 0 for all x in (a, b) and f'(x) = 0 for only a finite number of x,

then f has an inverse. If instead the domain is of the form [a, b], or [a, b), or
(a, b], and f is continuous on the whole domain, then f still has an inverse if
any of the above four conditions are true.

Here’s another example. Suppose g(x) = cos(x) on the domain (0, π).
Does g have an inverse? Well, g'(x) = −sin(x). We know that sin(x) > 0
on the interval (0, π); just look at its graph if you don’t believe this. Since
g'(x) = −sin(x), we see that g'(x) < 0 for all x in (0, π). This means that g
has an inverse. In fact, we know that g has an inverse on all of [0, π], since g is
continuous there. The idea is that g(0) = 1, so g starts out at height 1; then,
since g'(x) < 0 when 0 < x < π, we know that g immediately gets lower than
1. Since g(π) = −1, the values of g(x) go down to −1 without ever hitting
the same value twice. So g has an inverse on all of [0, π]. We’ll come back to
this particular function in Section 10.2.2 below.

* Another way to show this is to complete the square: $x^2 - 2x + 5 = (x-1)^2 + 4 > 0$,
since all squares (such as $(x-1)^2$) are nonnegative.

Finally, let $h(x) = x^3$ on all of R. We know that $h'(x) = 3x^2$, which can’t
be negative. So h'(x) ≥ 0 for all x. Luckily, h'(x) = 0 only when x = 0, so
there’s just one little point where h'(x) = 0. That’s OK, so h still has an
inverse; in fact, $h^{-1}(x) = \sqrt[3]{x}$.


10.1.2 Derivatives and inverse functions: what can go wrong 



We noticed that the derivative of our function is allowed to be 0 occasionally
and the function can still have an inverse. Why can’t f'(x) = 0 a little more
often? For example, suppose that f is defined by

$$f(x) = \begin{cases} -x^2 + 1 & \text{if } x < 0,\\ 1 & \text{if } 0 \le x \le 1,\\ x^2 - 2x + 2 & \text{if } x > 1. \end{cases}$$

When x < 0, we have f'(x) = −2x, which is positive (since x is negative!).
When 0 < x < 1, we have f'(x) = 0; and when x > 1, we can see that
f'(x) = 2x − 2 = 2(x − 1), which is certainly positive. Also, the function
values and derivatives both match at the join points x = 0 and x = 1, so
we’ve shown that f is differentiable and f'(x) ≥ 0 for all x. (See Section 6.6
in Chapter 6 to review why this works.) Unfortunately the horizontal line
test fails, and there is no inverse! Check out the graph:

[Figure: graph of y = f(x), which is flat at height 1 between x = 0 and x = 1]

The horizontal line y = 1 hits this graph infinitely often: everywhere between
x = 0 and x = 1 inclusive. The function f is constant on [0, 1], which is
consistent with the fact that f'(x) = 0 for these x.
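To see the failure concretely, here is a small Python sketch. It assumes the three pieces of f are $-x^2 + 1$ for x < 0, the constant 1 on [0, 1], and $x^2 - 2x + 2$ for x > 1, which is what the derivatives −2x, 0, and 2x − 2 quoted above (together with the matching values at the join points) require. The sample grid is arbitrary.

```python
def f(x):
    """Piecewise function from this example: flat on [0, 1]."""
    if x < 0:
        return -x**2 + 1
    if x <= 1:
        return 1.0
    return x**2 - 2 * x + 2

def f_prime(x):
    """Derivative of each piece; the pieces agree at the join points 0 and 1."""
    if x < 0:
        return -2 * x
    if x <= 1:
        return 0.0
    return 2 * x - 2

xs = [i / 4 for i in range(-8, 13)]            # -2.0, -1.75, ..., 3.0
never_decreasing = all(f_prime(x) >= 0 for x in xs)
hits_of_y_equals_1 = [x for x in xs if f(x) == 1.0]
print(never_decreasing, hits_of_y_equals_1)
```

Even though the slope is never negative, several different sample points land on the line y = 1, so no inverse function could tell them apart.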

Here’s another potential problem. The four conditions in the summary
above all require that the domain be an interval like (a, b). What if the domain
isn’t in one piece? Unfortunately, then the conclusion can totally fail to hold.
For example, if f(x) = tan(x), then $f'(x) = \sec^2(x)$, which can’t be negative;
however, you can see from the graph that y = tan(x) fails the horizontal line
test pretty miserably. (See Section 10.2.3 below to remind yourself about the
graph of y = tan(x).) So the methods of the previous section won’t work, in
general, when your function has discontinuities or vertical asymptotes.

10.1.3 Finding the derivative of an inverse function

If you know that a function f has an inverse, which we’ll call $f^{-1}$ as usual,
then what’s the derivative of that inverse? Here’s how you find it. Start
off with the equation $y = f^{-1}(x)$. You can rewrite this as f(y) = x. Now
differentiate implicitly with respect to x to get

$$\frac{d}{dx}\bigl(f(y)\bigr) = \frac{d}{dx}(x).$$

The right-hand side is easy: it’s just 1. To find the left-hand side, we use
implicit differentiation (see Chapter 8). If we set u = f(y), then by the chain
rule (noting that du/dy = f'(y)), we have

$$\frac{du}{dx} = \frac{du}{dy}\,\frac{dy}{dx} = f'(y)\,\frac{dy}{dx},$$

so our equation becomes f'(y)(dy/dx) = 1.
Now divide both sides by f'(y) to get the following principle:

$$\text{if } y = f^{-1}(x),\quad\text{then}\quad \frac{dy}{dx} = \frac{1}{f'(y)}.$$

If you want to express everything in terms of x, then you have to replace y
by $f^{-1}(x)$ to get

$$\frac{d}{dx}\bigl(f^{-1}(x)\bigr) = \frac{1}{f'\bigl(f^{-1}(x)\bigr)}.$$

In words, this means that the derivative of the inverse is basically the
reciprocal of the derivative of the original function, except that you have to evaluate
this latter derivative at $f^{-1}(x)$ instead of x.

For example, set $f(x) = \tfrac{1}{3}x^3 - x^2 + 5x - 11$. We saw in Section 10.1.1
above that f has an inverse on all of R. If we set $y = f^{-1}(x)$, then what is
dy/dx in general? What is its value when x = −11? To do the first part, all
you have to do is to see that $f'(x) = x^2 - 2x + 5$, so

$$\frac{dy}{dx} = \frac{1}{f'(y)} = \frac{1}{y^2 - 2y + 5}.$$

Note that it’s important to replace x by y here. Anyway, now we can solve
the second part. We know that x = −11, but what is y? Since $y = f^{-1}(x)$,
we know that f(y) = x. By the definition of f, we have

$$\tfrac{1}{3}y^3 - y^2 + 5y - 11 = -11.$$

Now clearly y = 0 is a solution to this equation, and it must be the only
solution because the inverse exists. So, when x = −11, we have y = 0, and
then

$$\frac{dy}{dx} = \frac{1}{y^2 - 2y + 5} = \frac{1}{(0)^2 - 2(0) + 5} = \frac{1}{5}.$$

More formally, one can write $(f^{-1})'(-11) = 1/5$.
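You can test this answer without ever finding a formula for the inverse. The Python sketch below computes the inverse numerically by bisection (a swapped-in numerical technique, not something from this chapter; it works here because f is increasing), then compares the slope of the inverse at x = −11 with 1/f'(y). The search interval [−100, 100] and the step h are arbitrary choices for this illustration.

```python
def f(x):
    return x**3 / 3 - x**2 + 5 * x - 11

def f_prime(x):
    return x**2 - 2 * x + 5

def f_inverse(target, lo=-100.0, hi=100.0, iterations=200):
    """Solve f(y) = target by bisection; valid because f is increasing."""
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if f(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

x0 = -11.0
y0 = f_inverse(x0)                      # should be very close to 0
by_formula = 1 / f_prime(y0)            # 1 / f'(y), should be close to 1/5
h = 1e-5
by_difference = (f_inverse(x0 + h) - f_inverse(x0 - h)) / (2 * h)
print(y0, by_formula, by_difference)
```

Both slopes should come out close to 0.2, in agreement with $(f^{-1})'(-11) = 1/5$.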
Now suppose that $h(x) = x^3$ as in Section 10.1.1 above. We saw there
that h has an inverse, and we even have a way to write it: $h^{-1}(x) = x^{1/3}$. Of
course, we could just use the rule for differentiating $x^a$ with respect to x, but
let’s try the above method. We know that $h'(x) = 3x^2$; if $y = h^{-1}(x)$, then

$$\frac{dy}{dx} = \frac{1}{h'(y)} = \frac{1}{3y^2}.$$

Now we can solve the equation $x = y^3$ for y to get $y = x^{1/3}$, and substitute
into the above equation to get

$$\frac{dy}{dx} = \frac{1}{3(x^{1/3})^2} = \frac{1}{3x^{2/3}}.$$

This is all pretty silly, because we could just have differentiated $y = x^{1/3}$ and
gotten the same answer without nearly so much work. Nevertheless it’s nice
to know that it all works out.

Before we move on to another example, let’s just note that the derivative
of the inverse function doesn’t exist when x = 0, since the denominator $3x^{2/3}$
vanishes. So even though the original function is differentiable everywhere, the
inverse isn’t differentiable everywhere: its derivative doesn’t exist at x = 0.
This is true in general, not just for the function h from above. If you have
any function which has an inverse, and it has slope 0 at the point (x, y), the
inverse function will have infinite slope at the point (y, x), as the following
picture illustrates:

[Figure: a function with slope 0 at (x, y) and its inverse with infinite slope at (y, x)]




Sometimes you don’t know much about a function, but you can still find
out something about the derivative of the inverse function. For example,
suppose you know that $g(x) = \sin(f^{-1}(x))$ for some invertible function f, but
all you know about f is that f(π) = 2 and f'(π) = 5. That’s actually enough
information to find the values of g(2) and g'(2). In particular, since f(π) = 2
and f is invertible, we have $f^{-1}(2) = \pi$, so $g(2) = \sin(f^{-1}(2)) = \sin(\pi) = 0$.
Also, by the chain rule and the above boxed formula for $(f^{-1})'(x)$, we have

$$g'(x) = \cos\bigl(f^{-1}(x)\bigr) \times \frac{d}{dx}\bigl(f^{-1}(x)\bigr) = \cos\bigl(f^{-1}(x)\bigr) \times \frac{1}{f'\bigl(f^{-1}(x)\bigr)}.$$

Putting x = 2 and using the facts that $f^{-1}(2) = \pi$ and f'(π) = 5, we get

$$g'(2) = \cos\bigl(f^{-1}(2)\bigr) \times \frac{1}{f'\bigl(f^{-1}(2)\bigr)} = \cos(\pi) \times \frac{1}{f'(\pi)} = -1 \times \frac{1}{5} = -\frac{1}{5}.$$

Make sure you know both the above versions of the formula for the derivative
of an inverse function!


10.1.4 A big example

Let’s finish off with an example that involves most of the theory we’ve looked
at so far in this chapter. Suppose that

$$f(x) = x^2(x-5)^3 \quad\text{on the domain } [2, \infty).$$

Here’s what we want to do:

1. show that f is invertible;

2. find the domain and range of the inverse $f^{-1}$;

3. check that f(4) = −16; and finally,

4. compute $(f^{-1})'(-16)$.

For #1, use the product rule and the chain rule to see that

$$f'(x) = 2x(x-5)^3 + 3x^2(x-5)^2.$$

Noticing that x and $(x-5)^2$ are factors of both terms on the right, we can
rewrite this as

$$f'(x) = x(x-5)^2\bigl(2(x-5) + 3x\bigr) = x(x-5)^2(5x-10) = 5x(x-5)^2(x-2).$$

When x > 2 (remember, the domain of f is [2, ∞)), all three of the factors
5x, $(x-5)^2$, and (x − 2) are nonnegative, so their product is as well. We
have now shown that f'(x) ≥ 0 on (2, ∞). Also, the only place in this domain
where f'(x) = 0 is x = 5. Since f is continuous on [2, ∞), the methods of
Section 10.1.1 above show that f has an inverse.

Let’s move on to #2. The range of the inverse $f^{-1}$ is just the domain of
$f$, which of course is $[2, \infty)$. Alas, the domain of $f^{-1}$ is harder to
find. Indeed, the domain of $f^{-1}$ is precisely the range of $f$, so we need to
do some work and find this range. It’s not such a big deal, though. We know that
$f$ is always increasing, so this means that $f(2)$ is the lowest point. That is,
the function starts at height $f(2)$, which works out to be $2^2(-3)^3 = -108$,
and increases. How high does it get? Well, as $x$ gets larger and larger, $f$
does as well; there’s no limit to how much it increases. This means that $f$
covers all the numbers from $-108$ upward, so the domain of $f^{-1}$ is the same
as the range of $f$, which is $[-108, \infty)$.

We still have to do the last two parts of the problem. For #3, it’s an easy
calculation to show that $f(4) = -16$, which means that $f^{-1}(-16) = 4$.
Moving on to #4, if $y = f^{-1}(x)$, then we know that

$$\frac{dy}{dx} = \frac{1}{f'(y)} = \frac{1}{5y(y - 5)^2(y - 2)}.$$
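
The formula above can be sanity-checked numerically. Here’s a minimal Python sketch; the helper names, the bisection bracket $[2, 100]$, and the step size are illustrative assumptions, not part of the original example. Since $x = -16$ corresponds to $y = 4$, the formula suggests a slope of $1/(5 \cdot 4 \cdot 1 \cdot 2) = 1/40$, and the code compares that with a finite-difference estimate of the slope of the inverse.

```python
def f(x):
    """f(x) = x^2 (x - 5)^3, considered on the domain [2, infinity)."""
    return x**2 * (x - 5)**3

def f_prime(x):
    """Factored derivative from the example: 5x (x - 5)^2 (x - 2)."""
    return 5 * x * (x - 5)**2 * (x - 2)

def f_inverse(target, lo=2.0, hi=100.0, steps=200):
    """Invert f on [lo, hi] by bisection; valid because f is increasing there."""
    for _ in range(steps):
        mid = (lo + hi) / 2
        if f(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

y = f_inverse(-16)                       # the text says this should be 4
slope_formula = 1 / f_prime(4)           # 1 / (5 * 4 * 1 * 2)
h = 1e-6
slope_numeric = (f_inverse(-16 + h) - f_inverse(-16 - h)) / (2 * h)

print(f(2), f(4), y, slope_formula, slope_numeric)
```

The bisection only works because $f$ is increasing on its domain, which is exactly what part #1 of the example established.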
10.2 Inverse Trig Functions

Now it’s time to investigate the inverse trig functions. We’ll see how to define 
them, what their graphs look like, and how to differentiate them. Let’s look 
at them one at a time, beginning with inverse sine. 

10.2.1 Inverse sine 

Let’s start by looking at the graph of $y = \sin(x)$ once again:



Does the sine function have an inverse? You can see from the above graph
that the horizontal line test fails pretty miserably. In fact, every horizontal
line of height between $-1$ and $1$ intersects the graph infinitely many times,
which is a lot more than the zero or one time we can tolerate. So, using
the tactic described in Section 1.2.3 in Chapter 1, we throw away as little of
the domain as possible in order to pass the horizontal line test. There are
many options, but the sensible one is to restrict the domain to the interval
$[-\pi/2, \pi/2]$. Here’s the effect of this:

[Figure: graph of $y = \sin(x)$, with the portion $-\pi/2 \le x \le \pi/2$ drawn solid.]

The solid portion of the curve is all we have left after we restrict the domain.
Clearly we can’t go to the right of $\pi/2$ or else we’ll start repeating the
values immediately to the left of $\pi/2$ as the curve dips back down. A similar
thing happens at $-\pi/2$. So, we’re stuck with our interval.

OK, if $f(x) = \sin(x)$ with domain $[-\pi/2, \pi/2]$, then it satisfies the
horizontal line test, so it has an inverse $f^{-1}$. We’ll write $f^{-1}(x)$ as
$\sin^{-1}(x)$ or $\arcsin(x)$. (Beware: the first of these notations is a
little confusing at first, since $\sin^{-1}(x)$ does not mean the same thing as
$(\sin(x))^{-1}$, even though $\sin^2(x) = (\sin(x))^2$ and
$\sin^3(x) = (\sin(x))^3$.)

So, what is the domain of the inverse sine function? Well, since the range of
$f(x) = \sin(x)$ is $[-1, 1]$, the domain of the inverse function is $[-1, 1]$.
And since the domain of our function $f$ is $[-\pi/2, \pi/2]$ (since that’s how
we restricted the domain), the range of the inverse is $[-\pi/2, \pi/2]$.

How about the graph of $y = \sin^{-1}(x)$? We just have to take the restricted
graph of $y = \sin(x)$ and reflect it in the mirror line $y = x$; it looks like this:

Here’s a neat way to remember how to draw this graph. Start by reflecting
all of $y = \sin(x)$ in the line $y = x$, then throw away all but the correct part
of it. This graph shows how the above graph of $y = \sin^{-1}(x)$ is just part of
the tipped-over graph of $y = \sin(x)$:



Note that since $\sin(x)$ is an odd function of $x$, so is $\sin^{-1}(x)$. This is
consistent with the above graphs.

Now let’s differentiate the inverse sine function. Set $y = \sin^{-1}(x)$; we want
to find $dy/dx$. The snazziest way to do this is to write $x = \sin(y)$ and then
differentiate both sides implicitly with respect to $x$:

$$\frac{d}{dx}(x) = \frac{d}{dx}(\sin(y)).$$

The left-hand side is just 1, but the right-hand side needs the chain rule. You
should check that you get $\cos(y)(dy/dx)$. So we have

$$1 = \cos(y)\,\frac{dy}{dx},$$

which simplifies to

$$\frac{dy}{dx} = \frac{1}{\cos(y)}.$$

Actually, we could have written this down immediately using the formula from
Section 10.1.3 above. Now, we really want the derivative in terms of $x$, not
$y$. No problem: we know that $\sin(y) = x$, so it shouldn’t be too hard to find
$\cos(y)$. In fact, $\cos^2(y) + \sin^2(y) = 1$, which means that
$\cos^2(y) + x^2 = 1$. This leads to the equation $\cos(y) = \pm\sqrt{1 - x^2}$,
so we have

$$\frac{dy}{dx} = \pm\frac{1}{\sqrt{1 - x^2}}.$$

But which is it? Plus or minus? If you look at the graph of $y = \sin^{-1}(x)$
above, you can see that the slope is always positive. This means that we have
to take the positive square root:

$$\frac{d}{dx}\sin^{-1}(x) = \frac{1}{\sqrt{1 - x^2}} \quad\text{for } -1 < x < 1.$$

Note that $\sin^{-1}(x)$ is not differentiable, even in the one-sided sense, at the
endpoints $x = 1$ and $x = -1$, since the denominator $\sqrt{1 - x^2}$ is 0 in both
these cases.

In addition to the derivative formula and the above graph, here’s a summary
of the important facts about the inverse sine function:

$\sin^{-1}$ is odd; it has domain $[-1, 1]$ and range $[-\pi/2, \pi/2]$.

Now that you have a new derivative formula, you should become comfortable
using the product, quotient, and chain rules in association with it. For
example, what are

$$\frac{d}{dx}\left(\sin^{-1}(7x)\right) \quad\text{and}\quad \frac{d}{dx}\left(x\sin^{-1}(x^3)\right)?$$

For the first one, you could use the chain rule, setting $t = 7x$, or you could use
the principle from the end of Section 7.2.1 in Chapter 7: when you replace $x$
by $ax$, you have to multiply the derivative by $a$. So we have

$$\frac{d}{dx}\left(\sin^{-1}(7x)\right) = 7 \times \frac{1}{\sqrt{1 - (7x)^2}} = \frac{7}{\sqrt{1 - 49x^2}}.$$

For the second question, start by setting $y = x\sin^{-1}(x^3)$; also put $u = x$
and $v = \sin^{-1}(x^3)$, so that $y = uv$. We’ll need to use the product rule:

$$\frac{dy}{dx} = v\frac{du}{dx} + u\frac{dv}{dx} = \sin^{-1}(x^3) \times 1 + x\frac{dv}{dx}.$$

To finish it off, we must find $dv/dx$. Since $v = \sin^{-1}(x^3)$, if we set
$t = x^3$ then $v = \sin^{-1}(t)$. By the chain rule,

$$\frac{dv}{dx} = \frac{dv}{dt}\,\frac{dt}{dx} = \frac{1}{\sqrt{1 - t^2}} \times 3x^2 = \frac{3x^2}{\sqrt{1 - (x^3)^2}} = \frac{3x^2}{\sqrt{1 - x^6}}.$$

Plug this into the previous equation to see that

$$\frac{dy}{dx} = \sin^{-1}(x^3) \times 1 + x\frac{dv}{dx} = \sin^{-1}(x^3) + \frac{3x^3}{\sqrt{1 - x^6}},$$

and we’re all done.
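
A good habit is to spot-check derivative formulas like these numerically. Here’s a rough Python sketch that compares both answers with central-difference estimates; the sample points $0.1$ and $0.5$ and the step size are arbitrary assumptions (chosen so that the arguments of the inverse sine stay strictly inside $(-1, 1)$).

```python
import math

def central_difference(func, x, h=1e-6):
    # Symmetric finite-difference estimate of func'(x).
    return (func(x + h) - func(x - h)) / (2 * h)

def first(x):
    return math.asin(7 * x)

def first_derivative(x):
    # Formula from the text: 7 / sqrt(1 - 49 x^2).
    return 7 / math.sqrt(1 - 49 * x**2)

def second(x):
    return x * math.asin(x**3)

def second_derivative(x):
    # Formula from the text: asin(x^3) + 3 x^3 / sqrt(1 - x^6).
    return math.asin(x**3) + 3 * x**3 / math.sqrt(1 - x**6)

error_first = abs(central_difference(first, 0.1) - first_derivative(0.1))
error_second = abs(central_difference(second, 0.5) - second_derivative(0.5))
print(error_first, error_second)  # both should be tiny
```

A small discrepancy is expected from the finite step; a large one would suggest an algebra slip in the hand calculation.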


10.2.2 Inverse cosine 

We’re going to repeat the procedure from the previous section in order to
understand the inverse cosine function. Start with the graph of $y = \cos(x)$:

Once again, no inverse. This time, restricting the domain to $[-\pi/2, \pi/2]$
won’t work, since the horizontal line test would fail and also we’d be throwing
away part of the range that would be useful. Already on the above graph, you can
see that the section on $[0, \pi]$ is highlighted and obeys the horizontal line
test, so that’s what we’ll use. We get an inverse function which we write as
$\cos^{-1}$ or $\arccos$. Like inverse sine, the domain of inverse cosine is
$[-1, 1]$, since that’s the range of cosine. On the other hand, the range of
inverse cosine is $[0, \pi]$, since that’s the restricted domain of cosine that
we’re using. The graph of $y = \cos^{-1}(x)$ is formed by reflecting the graph of
$y = \cos(x)$ in the mirror $y = x$:

Notice that the graph shows that $\cos^{-1}$ is neither even nor odd. This is
despite the fact that $\cos(x)$ is an even function of $x$! In any case, if you
have trouble drawing the above graph from memory, just draw the graph of
$\cos(x)$ on its side and pick out the bit with range $[0, \pi]$, like this:

Now it’s time to differentiate $y = \cos^{-1}(x)$ with respect to $x$. We do exactly
the same thing we did in the previous section. Start by writing $x = \cos(y)$
and differentiating implicitly with respect to $x$:

$$\frac{d}{dx}(x) = \frac{d}{dx}(\cos(y)).$$

The left-hand side is 1 and the right-hand side is $-\sin(y)(dy/dx)$. This can
be rearranged into

$$\frac{dy}{dx} = -\frac{1}{\sin(y)}.$$

Since $\cos^2(y) + \sin^2(y) = 1$, and also $x = \cos(y)$, we have
$\sin(y) = \pm\sqrt{1 - x^2}$. This means that

$$\frac{dy}{dx} = -\frac{1}{\pm\sqrt{1 - x^2}} = \pm\frac{1}{\sqrt{1 - x^2}}.$$

Unlike the case of inverse sine, the graph of inverse cosine is all downhill,
which means that the slope is always negative, so we get

$$\frac{d}{dx}\cos^{-1}(x) = -\frac{1}{\sqrt{1 - x^2}} \quad\text{for } -1 < x < 1.$$

Here are the other facts about inverse cosine that we collected above:

$\cos^{-1}$ is neither even nor odd; it has domain $[-1, 1]$ and range $[0, \pi]$.


Before we move on to the inverse tangent function, let’s just look at the
derivatives of inverse sine and inverse cosine side by side:

$$\frac{d}{dx}\sin^{-1}(x) = \frac{1}{\sqrt{1 - x^2}} \quad\text{and}\quad \frac{d}{dx}\cos^{-1}(x) = -\frac{1}{\sqrt{1 - x^2}}.$$

The derivatives are negatives of each other! Let’s try to see why this makes
sense. If you plot $y = \sin^{-1}(x)$ and $y = \cos^{-1}(x)$ on the same set of axes,
here’s what you get:

The two mountain-climbers in the above picture experience exactly opposite
conditions at the same horizontal point, so it makes sense that the derivatives
should be negatives of each other. Indeed, we now know that

$$\frac{d}{dx}\left(\sin^{-1}(x) + \cos^{-1}(x)\right) = \frac{1}{\sqrt{1 - x^2}} - \frac{1}{\sqrt{1 - x^2}} = 0.$$

So $y = \sin^{-1}(x) + \cos^{-1}(x)$ has constant slope 0, which means that it’s flat
as a pancake. In fact, if you add up the heights of the function values in the
two graphs above, you can see that you get $\pi/2$ for any value of $x$. We’ve just
used calculus to prove the following identity:

$$\sin^{-1}(x) + \cos^{-1}(x) = \frac{\pi}{2}$$

for any $x$ in the interval $[-1, 1]$. When you think about it, this makes sense,
though! Look at the following diagram:

Since $\sin(\alpha) = x$, we have $\alpha = \sin^{-1}(x)$. Similarly, $\cos(\beta) = x$,
which means that $\beta = \cos^{-1}(x)$. But $\alpha + \beta = \pi/2$, which means that

$$\sin^{-1}(x) + \cos^{-1}(x) = \frac{\pi}{2}$$

once again. Kind of nice how the calculus agrees with the geometry, huh?
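
If you want a third opinion beyond the calculus and the geometry, a computer can weigh in too. This is just a quick Python sketch; the sample points are arbitrary assumptions, picked to include both endpoints of $[-1, 1]$.

```python
import math

samples = [-1.0, -0.5, 0.0, 0.25, 1.0]

# For each sample x, add the inverse sine and the inverse cosine.
sums = [math.asin(x) + math.acos(x) for x in samples]

for x, s in zip(samples, sums):
    print(x, s, math.pi / 2)  # the last two columns should agree
```

Of course, checking five points doesn’t prove anything; it’s the zero-slope argument above, together with the value at a single point, that does the real work.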

10.2.3 Inverse tangent 

Here we go again. Let’s remember the graph of $y = \tan(x)$:

Now let’s differentiate $y = \tan^{-1}(x)$ with respect to $x$. Write $x = \tan(y)$
and differentiate implicitly with respect to $x$. Check to make sure that you
believe that

$$\frac{dy}{dx} = \frac{1}{\sec^2(y)}.$$

Since $\sec^2(y) = 1 + \tan^2(y)$, and $\tan(y) = x$, we see that
$\sec^2(y) = 1 + x^2$. This means that

$$\frac{d}{dx}\tan^{-1}(x) = \frac{1}{1 + x^2} \quad\text{for all real } x.$$

We also have the following facts from above:

$\tan^{-1}$ is odd; it has domain $\mathbb{R}$ and range $(-\pi/2, \pi/2)$.


Unlike inverse sine and inverse cosine, the inverse tangent function has
horizontal asymptotes. (The first two functions don’t have a chance, since their
domains are both $[-1, 1]$.) As you can see from the graph above, $\tan^{-1}(x)$
tends to $\pi/2$ as $x \to \infty$, and it tends to $-\pi/2$ as $x \to -\infty$. In
fact, the vertical asymptotes $x = \pi/2$ and $x = -\pi/2$ of the tangent function
have become horizontal asymptotes of the inverse tan function. This means that
we have the following useful limits:

$$\lim_{x \to \infty}\tan^{-1}(x) = \frac{\pi}{2} \quad\text{and}\quad \lim_{x \to -\infty}\tan^{-1}(x) = -\frac{\pi}{2}.$$

By the way, we’ve seen these limits before, in Section 3.5 of Chapter 3. In
any case, these limits can come up in conjunction with other limits at
$\pm\infty$; for example, to find

$$\lim_{x \to -\infty}\frac{x^2 - 6x + 4}{(2x^2 + 7x - 8)\tan^{-1}(3x)},$$

first separate the fraction to get

$$\lim_{x \to -\infty}\frac{x^2 - 6x + 4}{2x^2 + 7x - 8} \times \frac{1}{\tan^{-1}(3x)}.$$

The first fraction has limit $1/2$ (check it!), but what happens to the second
fraction? Well, as $x$ becomes very negatively large, $3x$ also does, so
$\tan^{-1}(3x)$ tends to $-\pi/2$. So the whole limit is

$$\frac{1}{2} \times \frac{1}{-\pi/2} = -\frac{1}{\pi}.$$

However, suppose that we replace the $3x$ term by $3x^2$, like this:

$$\lim_{x \to -\infty}\frac{x^2 - 6x + 4}{(2x^2 + 7x - 8)\tan^{-1}(3x^2)}.$$

Now $\tan^{-1}(3x^2)$ has limit $\pi/2$ even when $x \to -\infty$, because then $3x^2$
tends to $\infty$, not $-\infty$. So the overall limit in this case is $1/\pi$.
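
To see the sign flip between these two limits in action, you can plug in a very negative value of $x$. The Python sketch below does that; the value $x = -10^8$ and the helper name `ratio` are illustrative assumptions, and a single large input only suggests the limit rather than proving it.

```python
import math

def ratio(x, inner):
    # (x^2 - 6x + 4) / ((2x^2 + 7x - 8) * arctan(inner(x)))
    numerator = x**2 - 6 * x + 4
    denominator = (2 * x**2 + 7 * x - 8) * math.atan(inner(x))
    return numerator / denominator

x = -1e8
with_3x = ratio(x, lambda t: 3 * t)            # expected to be near -1/pi
with_3x_squared = ratio(x, lambda t: 3 * t**2)  # expected to be near +1/pi

print(with_3x, -1 / math.pi)
print(with_3x_squared, 1 / math.pi)
```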
Make sure you see why this leads to

$$\frac{dy}{dx} = \frac{1}{\sec(y)\tan(y)}.$$

Now $x = \sec(y)$, so since $\sec^2(y) = 1 + \tan^2(y)$, we can rearrange and take
square roots to show that $\tan(y) = \pm\sqrt{x^2 - 1}$. This means that

$$\frac{dy}{dx} = \frac{1}{\pm x\sqrt{x^2 - 1}}.$$

Is it plus or minus? Looking at the graph of $y = \sec^{-1}(x)$ above, you can
see that the slope is always positive. So in fact we need to be a little more
clever: instead of the plus or minus, we can simply put $|x|$ instead of $x$ and
we always get something positive. That is,

$$\frac{d}{dx}\sec^{-1}(x) = \frac{1}{|x|\sqrt{x^2 - 1}} \quad\text{for } x > 1 \text{ or } x < -1.$$

We can summarize the other facts about inverse secant like this:

$\sec^{-1}$ is neither odd nor even; it has domain $(-\infty, -1] \cup [1, \infty)$ and range $[0, \pi] \setminus \{\pi/2\}$.

(Here I used the standard abbreviations of $\cup$ to mean the union of two
intervals, and $\setminus$ to mean “not including.”)

10.2.5 Inverse cosecant and inverse cotangent 


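Let’s just wrap the last two inverse trig functions up quickly. You can repeat
the above analyses to find the domain, range, and graphs of $y = \csc^{-1}(x)$ and
$y = \cot^{-1}(x)$:

$\csc^{-1}$ is odd; it has domain $(-\infty, -1] \cup [1, \infty)$ and range $[-\pi/2, \pi/2] \setminus \{0\}$.

$\cot^{-1}$ is neither even nor odd; it has domain $\mathbb{R}$ and range $(0, \pi)$.

This is what the graphs look like:

Both functions have horizontal asymptotes: $y = \csc^{-1}(x)$ has a two-sided
horizontal asymptote at $y = 0$, and $y = \cot^{-1}(x)$ has a left-hand horizontal
asymptote at $y = \pi$ and a right-hand one at $y = 0$. We can summarize the
limits as follows:

$$\lim_{x \to \infty}\csc^{-1}(x) = 0 \quad\text{and}\quad \lim_{x \to -\infty}\csc^{-1}(x) = 0,$$

$$\lim_{x \to \infty}\cot^{-1}(x) = 0 \quad\text{and}\quad \lim_{x \to -\infty}\cot^{-1}(x) = \pi.$$

Of course, if you know the above graphs, you can reconstruct the limits
without having to remember them. Notice that the graphs of $y = \csc^{-1}(x)$ and
$y = \sec^{-1}(x)$ from above are very similar; in fact, you can get one from the
other by flipping about the line $y = \pi/4$. This is exactly the same relation as
the one that $y = \sin^{-1}(x)$ and $y = \cos^{-1}(x)$ have with each other. So it’s
not surprising that the derivative of $\csc^{-1}(x)$ is just the negative of the
derivative of $\sec^{-1}(x)$:

$$\frac{d}{dx}\csc^{-1}(x) = -\frac{1}{|x|\sqrt{x^2 - 1}} \quad\text{for } x > 1 \text{ or } x < -1.$$

The same thing happens with $\cot^{-1}(x)$ and $\tan^{-1}(x)$, so that

$$\frac{d}{dx}\cot^{-1}(x) = -\frac{1}{1 + x^2} \quad\text{for all real } x.$$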

10.2.6 Computing inverse trig functions 

We’ve completed a pretty thorough survey of the inverse trig functions. Since
you have a few more derivative rules, it’s a great idea to practice differentiating
functions involving inverse trig functions. Meanwhile, let’s not neglect some
basic computations involving inverse trig functions which don’t involve any
calculus. For one thing, you should try to make sure that you can compute
quantities like $\sin^{-1}(1/2)$, $\cos^{-1}(1)$, and $\tan^{-1}(1)$ without stretching
your brain. For example, to find $\sin^{-1}(1/2)$, remember that you’re looking for
an angle in $[-\pi/2, \pi/2]$ whose sine is $1/2$. Of course: it’s $\pi/6$. Similarly,
it should be almost second nature to write down $\cos^{-1}(1) = 0$ and
$\tan^{-1}(1) = \pi/4$. All the common values are in the table near the beginning
of Chapter 2.

Now, here’s a more interesting question: how would you simplify

$$\sin^{-1}\left(\sin\left(\frac{13\pi}{10}\right)\right)?$$

The knee-jerk reaction is to cancel out the inverse sine and the sine, leaving
only $13\pi/10$. This can’t be correct, though, since the range of inverse sine is
$[-\pi/2, \pi/2]$, as we saw in Section 10.2.1 above. What we really need to do
is find an angle in that range which has the same sine as $13\pi/10$. Well, note
that $13\pi/10$ is in the third quadrant, since it’s greater than $\pi$ but less than
$3\pi/2$, so its sine is negative. Furthermore, the reference angle is $3\pi/10$. The
possible angles in the range $[-\pi/2, \pi/2]$ with the same reference angle are
$3\pi/10$ and $-3\pi/10$. The first one has a positive sine, while the second has a
negative sine. We need a negative sine, so we’ve proved that

$$\sin^{-1}\left(\sin\left(\frac{13\pi}{10}\right)\right) = -\frac{3\pi}{10}.$$

Now, how about finding

$$\cos^{-1}\left(\cos\left(\frac{13\pi}{10}\right)\right)?$$

The previous answer $-3\pi/10$ can’t be correct here, since the range of inverse
cosine is $[0, \pi]$. Man, why does this stuff have to be so messy? Nothing I
can do about it, unfortunately ... so let’s deal with it like this: once again,
$13\pi/10$ is in the third quadrant, so its cosine is negative. The reference angle
is $3\pi/10$; the only angles in $[0, \pi]$ with the same reference angle are
$3\pi/10$ and $7\pi/10$. The cosines of these two angles are positive and negative,
respectively; since we want a negative cosine, we must have

$$\cos^{-1}\left(\cos\left(\frac{13\pi}{10}\right)\right) = \frac{7\pi}{10}.$$

I now leave it to you to show that

$$\tan^{-1}\left(\tan\left(\frac{13\pi}{10}\right)\right) = \frac{3\pi}{10}.$$

Just remember that tan is positive in the third quadrant! In any case, those
are all difficult examples, so I wouldn’t blame you if you also thought that
finding

$$\sin\left(\sin^{-1}\left(-\frac{1}{5}\right)\right)$$

would be hard as well. Luckily, it’s not: the answer is just $-1/5$. In general,
$\sin(\sin^{-1}(x)) = x$ provided that $x$ is in the domain $[-1, 1]$ of inverse sine.
(Otherwise, $\sin(\sin^{-1}(x))$ doesn’t even make sense!) The trouble comes when
you try to write $\sin^{-1}(\sin(x)) = x$. This just isn’t true, as the above example
where $x = 13\pi/10$ shows. Of course, the same observations apply to all the
other inverse trig functions. (See also the discussion at the end of Section 1.2
in Chapter 1.)
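
Python’s standard `math` module uses the same ranges for `asin`, `acos`, and `atan` as the ones described in this section, so it makes a handy way to check answers like these. Here’s a short sketch (the variable names are just illustrative):

```python
import math

angle = 13 * math.pi / 10   # third-quadrant angle from the examples

via_sine = math.asin(math.sin(angle))      # lands in [-pi/2, pi/2]
via_cosine = math.acos(math.cos(angle))    # lands in [0, pi]
via_tangent = math.atan(math.tan(angle))   # lands in (-pi/2, pi/2)
round_trip = math.sin(math.asin(-1 / 5))   # this order is harmless

print(via_sine, -3 * math.pi / 10)
print(via_cosine, 7 * math.pi / 10)
print(via_tangent, 3 * math.pi / 10)
print(round_trip)
```

Each pair of printed numbers should agree up to rounding, and none of the first three results equals the original angle $13\pi/10$.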

Two more examples: consider how you would find

$$\sin\left(\cos^{-1}\left(\frac{\sqrt{15}}{4}\right)\right) \quad\text{and}\quad \sin\left(\cos^{-1}\left(-\frac{\sqrt{15}}{4}\right)\right).$$

The trick in both cases is to use the trig identity $\cos^2(x) + \sin^2(x) = 1$. For
the first problem, let

$$x = \cos^{-1}\left(\frac{\sqrt{15}}{4}\right)$$

and note that we want to find $\sin(x)$. We actually know $\cos(x)$:

$$\cos(x) = \cos\left(\cos^{-1}\left(\frac{\sqrt{15}}{4}\right)\right) = \frac{\sqrt{15}}{4}.$$

Remember, there’s no problem taking the cosine of an inverse cosine: it’s only
the other way around that poses a problem. Anyway, we know $\cos(x)$, so by
rearranging the identity $\cos^2(x) + \sin^2(x) = 1$, we must have

$$\sin(x) = \pm\sqrt{1 - \cos^2(x)} = \pm\sqrt{1 - \frac{15}{16}} = \pm\sqrt{\frac{1}{16}} = \pm\frac{1}{4}.$$

So the answer we want is either $1/4$ or $-1/4$. Which one is it? Well, since
$\sqrt{15}/4$ is positive, inverse cosine of it must lie in $[0, \pi/2]$. That is, $x$
is in the first quadrant, so its sine is positive. We’ve finally shown that

$$\sin\left(\cos^{-1}\left(\frac{\sqrt{15}}{4}\right)\right) = \frac{1}{4}.$$

As for

$$\sin\left(\cos^{-1}\left(-\frac{\sqrt{15}}{4}\right)\right),$$

you can repeat the above argument to show that

$$\sin(x) = \pm\sqrt{1 - \cos^2(x)} = \pm\frac{1}{4}.$$

You might guess that the answer this time is $-1/4$, but that’s no good. You
see, $-\sqrt{15}/4$ is negative, so its inverse cosine must lie in the interval
$[\pi/2, \pi]$. That is, $x$ is in the second quadrant. The thing is, sine is
positive in the second quadrant as well! So $\sin(x)$ must be positive, and we’ve
shown that

$$\sin\left(\cos^{-1}\left(-\frac{\sqrt{15}}{4}\right)\right) = \frac{1}{4}$$

as well. In fact, we’ve noticed that $\sin(\cos^{-1}(A))$ must always be
nonnegative, even if $A$ is negative (note that $A$ has to lie in $[-1, 1]$, since
that’s the domain of inverse cosine). This is because $\cos^{-1}(A)$ is in the
interval $[0, \pi]$, and sine is nonnegative on that interval.
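
Putting the identity and the sign argument together suggests that $\sin(\cos^{-1}(A)) = \sqrt{1 - A^2}$ for every $A$ in $[-1, 1]$. Here’s a small Python sketch of that observation; the function name `sin_of_arccos` and the extra test values are assumptions added for illustration.

```python
import math

def sin_of_arccos(A):
    # Nonnegative square root, since arccos(A) lies in [0, pi] where sine >= 0.
    return math.sqrt(1 - A * A)

A = math.sqrt(15) / 4
positive_case = math.sin(math.acos(A))    # first-quadrant angle
negative_case = math.sin(math.acos(-A))   # second-quadrant angle

print(positive_case, negative_case)  # both should be close to 1/4

# Compare the direct computation with the square-root shortcut.
checks = [abs(math.sin(math.acos(t)) - sin_of_arccos(t)) for t in (-1.0, -0.3, 0.0, 0.8, 1.0)]
print(max(checks))
```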

We’ll actually look at another method of finding things like $\sin(\cos^{-1}(A))$
when we see how to do trig substitutions in Section 19.3 of Chapter 19. For 
now, let’s take a well-deserved rest from inverse trig functions and take a quick 
look at inverse hyperbolic functions. 


10.3 Inverse Hyperbolic Functions

The situation is a little different for hyperbolic functions, which we looked at 
in Section 9.7 of the previous chapter. Look back now and remind yourself 
what the graphs of these functions look like. In particular, you can see that 
the graph of $y = \cosh(x)$ is sort of like the graph of $y = x^2$, except shifted up
by 1 and shaped a little differently. If you want an inverse for this function,
you have to throw away the left half of the graph, just as you do when you
take the positive square root (and throw away the negative one). On the other
hand, $y = \sinh(x)$ already satisfies the horizontal line test, so there’s nothing
that needs to be done. So we get two inverse functions with the following
properties:

$\cosh^{-1}$ is neither odd nor even; it has domain $[1, \infty)$ and range $[0, \infty)$.

$\sinh^{-1}$ is odd; its domain and range are all of $\mathbb{R}$.

The graphs are obtained by reflecting the original graphs in the line $y = x$ as
usual:

[Figure: graphs of $y = \cosh^{-1}(x)$ and $y = \sinh^{-1}(x)$.]


The derivatives are obtained in the same way that we got the derivatives of
the inverse trig functions. In particular, if $y = \cosh^{-1}(x)$, then
$x = \cosh(y)$; differentiating implicitly with respect to $x$, we get

$$1 = \sinh(y)\,\frac{dy}{dx}.$$

(Remember that the derivative of $\cosh(x)$ with respect to $x$ is $\sinh(x)$, not
$-\sinh(x)$.) Now $\cosh^2(y) - \sinh^2(y) = 1$, so we can rearrange and take square
roots to see that $\sinh(y) = \pm\sqrt{\cosh^2(y) - 1} = \pm\sqrt{x^2 - 1}$. Since
$\cosh^{-1}(x)$ is clearly increasing in $x$, we end up with

$$\frac{d}{dx}\cosh^{-1}(x) = \frac{1}{\sqrt{x^2 - 1}} \quad\text{for } x > 1.$$

In exactly the same way, you should be able to check that

$$\frac{d}{dx}\sinh^{-1}(x) = \frac{1}{\sqrt{x^2 + 1}} \quad\text{for all real } x.$$

Now, let’s forget about the calculus for a few seconds and recall the definitions
of $\cosh(x)$ and $\sinh(x)$:

$$\cosh(x) = \frac{e^x + e^{-x}}{2} \quad\text{and}\quad \sinh(x) = \frac{e^x - e^{-x}}{2}.$$

Since we can write $\cosh(x)$ and $\sinh(x)$ in terms of exponentials, we should be
able to write the inverse functions in terms of logarithms. After all,
exponentials and logarithms are inverses of each other. Let’s see how it works.
For example, if $y = \cosh^{-1}(x)$, then $x = \cosh(y) = (e^y + e^{-y})/2$. Now you
can solve for $y$ by using a little trick. Let $u = e^y$; then $e^{-y} = 1/u$. The
equation then looks like this:

$$x = \frac{u + 1/u}{2}.$$

Multiply both sides by $2u$ and rearrange; we get a quadratic equation in $u$,
which is $u^2 - 2xu + 1 = 0$. By the quadratic formula,

$$e^y = u = x \pm \sqrt{x^2 - 1},$$

so taking logs of both sides,

$$y = \ln\left(x \pm \sqrt{x^2 - 1}\right).$$

Well, is it plus or minus? After a bit of gymnastics, you can actually see that
$x - \sqrt{x^2 - 1} < 1$ if $x > 1$. This means that the logarithm of it is negative
(remember, the log of a number between 0 and 1 is negative!). That’s not
what we want. So it’s the positive square root, and we just showed that

$$\cosh^{-1}(x) = \ln\left(x + \sqrt{x^2 - 1}\right)$$

when $x \ge 1$. In a similar way, you can show that

$$\sinh^{-1}(x) = \ln\left(x + \sqrt{x^2 + 1}\right)$$

for all $x$. As an exercise, you should try differentiating the right-hand sides of
these last two equations and check that your answers agree with the derivatives
of $\cosh^{-1}(x)$ and $\sinh^{-1}(x)$ we found above.
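
The two logarithm formulas are easy to compare against a library implementation. Here’s a brief Python sketch using `math.acosh` and `math.asinh` from the standard library; the function names `acosh_via_log` and `asinh_via_log` and the sample points are assumptions chosen for illustration.

```python
import math

def acosh_via_log(x):
    # ln(x + sqrt(x^2 - 1)), valid for x >= 1.
    return math.log(x + math.sqrt(x * x - 1))

def asinh_via_log(x):
    # ln(x + sqrt(x^2 + 1)), valid for all real x.
    return math.log(x + math.sqrt(x * x + 1))

acosh_errors = [abs(acosh_via_log(x) - math.acosh(x)) for x in (1.0, 1.5, 10.0)]
asinh_errors = [abs(asinh_via_log(x) - math.asinh(x)) for x in (-2.0, 0.0, 3.0)]

print(max(acosh_errors), max(asinh_errors))  # both should be tiny
```

The check at $x = -2$ is a reminder that the $\sinh^{-1}$ formula really does work for negative inputs, since $x + \sqrt{x^2 + 1}$ is always positive.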

10.3.1 The rest of the inverse hyperbolic functions 

So far, we’ve only looked at hyperbolic sine and cosine. If you repeat the 
analysis for the other four hyperbolic functions, you should be able to conclude 
that: 


$\tanh^{-1}$ is odd; its domain is $(-1, 1)$; its range is all of $\mathbb{R}$.


$\mathrm{sech}^{-1}$ is neither even nor odd; its domain is $(0, 1]$; its range is $[0, \infty)$.


$\mathrm{csch}^{-1}$ is odd; its domain and range are both $\mathbb{R} \setminus \{0\}$.


$\coth^{-1}$ is odd; its domain is $(-\infty, -1) \cup (1, \infty)$; its range is $\mathbb{R} \setminus \{0\}$.

Note that we’ve restricted the domain of sech to $[0, \infty)$ in order to get an
inverse, just as we did for cosh.

Now, here are the graphs, which you should compare with the graphs of 
the original (non-inverse) functions in Section 9.7 of the previous chapter: 










CHAPTER 11

The Derivative and Graphs


We have seen how to differentiate functions from several different families:
polynomials and poly-type functions, trig and inverse trig functions,
exponentials and logs, and even hyperbolic functions and their inverses. Now we
can use this knowledge to help us sketch graphs of functions in general. We’ll
see how the derivative helps us understand the maxima and minima of
functions, and how the second derivative helps us to understand the so-called
concavity of functions. All in all, we have the following agenda:

• global and local maxima and minima (that is, extrema) of functions, 
and how to find them using the derivative; 

• Rolle’s Theorem and the Mean Value Theorem, and their implications 
for sketching graphs; 

• the graphical interpretation of the second derivative; and 

• classifying points where the derivative vanishes. 

Then in the next chapter, we’ll look at a comprehensive method of sketching
graphs of functions using the above methods. 


11.1 Extrema of Functions


If we say that $x = a$ is an extremum of a function $f$, this means that $f$ has
a maximum or minimum at $x = a$. (The plural of “extremum” is “extrema,”
of course.) We’ve already looked a little bit at maxima and minima in
Section 5.1.6 of Chapter 5; I strongly suggest taking a peek back at that before
you read on. In any event, we need to go a little deeper and distinguish
between two types of extrema: global and local.

11.1.1 Global and local extrema

The basic idea of a maximum is that it occurs when the function value is
highest. Think about where the maximum of the following function on its
domain $[0, 7]$ should be:

Certainly the maximum value that this function gets to is 3, which occurs
when $x = 0$, so it’s true that the function has a maximum at $x = 0$. On the
other hand, imagine the graph is a hill (in cross-section) and you’re climbing
up it. Suppose you start at the point $(2, -1)$ and walk up the hill to the right.
Eventually you reach the peak at $(5, 2)$, and then you start going back down
again. It sure feels as if the peak is some sort of maximum: it’s the top of
the mountain, at height 2, even though there’s a neighboring peak to the left
that’s taller. If the high ground near $x = 0$ were covered in fog, you couldn’t
even see it when you climbed the peak at $(5, 2)$, so you’d really feel as if you
were at a maximum. In fact, if we restrict the domain to $[2, 7]$, then the point
$x = 5$ is actually a maximum.

We need a way of clarifying the situation. Let’s say that a global maximum
(or absolute maximum) occurs at $x = a$ if $f(a)$ is the highest value of $f$ on
the entire domain of $f$. In symbols, we want $f(a) \ge f(x)$ for any value $x$
in the domain of $f$. This is exactly the same definition we used before when
we looked at maxima in general; we’re simply being more precise and saying
“global maxima” instead of just “maxima.”

As we noted before, there could be multiple global maxima; for example,
$\cos(x)$ has a maximum value of 1, but this occurs for infinitely many values
of $x$. (These values are all the integer multiples of $2\pi$, as you can see from
the graph of $y = \cos(x)$.)

How about that other type of maximum? Let’s say that a local maximum
(or relative maximum) occurs at $x = a$ if $f(a)$ is the highest value of $f$ on
some small interval containing $a$. You can think of this as throwing away
most of the domain, just concentrating on values of $x$ close to $a$, then insisting
that the function is at its maximum out of only those values.

Let’s see how this works in the case of our above graph. We see that
$x = 5$ is a local maximum, since $(5, 2)$ is the highest point around if you only
concentrate on the function near $x = 5$. For example, if you cover up the part
of the graph to the left of $x = 3$, then the point $(5, 2)$ is the highest point
remaining. On the other hand, $x = 5$ isn’t a global maximum, since the point
$(0, 3)$ is higher up. This means that $x = 0$ is a global maximum. It’s also a
local maximum; in fact, it’s pretty obvious that every global maximum is
also a local maximum.

In the same way, we can define global and local minima. In the above
graph, you can see that $x = 2$ is a global minimum (with value $-1$), since the
height is at its lowest. On the other hand, $x = 7$ is actually a local minimum
(with value 0). Indeed, if you just look at the function to the right of $x = 5$,
you can see that the lowest height occurs at the endpoint $x = 7$.











val. We also saw that if the function isn’t continuous, or even if it is continuous
but the domain isn’t a closed interval, then there might not be a global
maximum or minimum. For example, the function $f$ given by $f(x) = 1/x$ on
the domain $[-1, 1] \setminus \{0\}$ doesn’t have a global maximum or minimum on that
domain. (Draw it and see why!)

The problem with the Max-Min Theorem is that it doesn’t tell you anything
about where these global maxima and minima are. That’s where the
derivative comes in. Let’s say that $x = c$ is a critical point for the function $f$
if either $f'(c) = 0$ or if $f'(c)$ does not exist. Then we have this nice result:

Extreme Value Theorem: suppose that $f$ is defined on $(a, b)$
and $c$ is in $(a, b)$. If $c$ is a local maximum or minimum of $f$, then
$c$ must be a critical point for $f$. That is, either $f'(c) = 0$ or $f'(c)$
does not exist.

So local maxima and minima in an open interval occur only at critical points.
But it’s not true that a critical point must be a local maximum or minimum!
For example, if $f(x) = x^3$, then $f'(x) = 3x^2$, and you can see that $f'(0) = 0$.
This means that $x = 0$ is a critical point for $f$. On the other hand, $x = 0$ is
neither a local maximum nor a local minimum, as you can see by drawing the
graph of $y = x^3$.
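
If you don’t feel like drawing the graph, a few function values on either side of $0$ tell the same story. The Python sketch below is only an illustration (the offsets in `offsets` are arbitrary assumptions, and sampling finitely many points is a sanity check rather than a proof):

```python
def f(x):
    return x**3

def f_prime(x):
    return 3 * x**2

offsets = [1e-1, 1e-3, 1e-6]

# x = 0 is a critical point because the derivative vanishes there.
is_critical = f_prime(0) == 0

# A local maximum would need f(0) >= f(x) on both sides; a local minimum, f(0) <= f(x).
looks_like_local_max = all(f(0) >= f(-h) and f(0) >= f(h) for h in offsets)
looks_like_local_min = all(f(0) <= f(-h) and f(0) <= f(h) for h in offsets)

print(is_critical, looks_like_local_max, looks_like_local_min)
```

The function is lower just to the left of $0$ and higher just to the right, so neither test can succeed.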

The above theorem applies to open intervals. How about when the domain
of your function is a closed interval $[a, b]$? Then the endpoints $a$ and $b$ might
be local maxima and minima; they aren’t covered by the theorem. So in the
case of a closed interval, local maxima and minima can occur only at critical
points or at the endpoints of the interval. For example, let’s take a closer
look at our graph from the previous section:


[Figure: the graph from the previous section, with the points (5,2) and (7,0) labeled.]

The local maxima are at $x = 0$ and $x = 5$, while the local minima are at $x = 2$
and $x = 7$. Of these, $x = 5$ and $x = 2$ are critical points, while $x = 0$ and
$x = 7$ are endpoints. Now let’s see why the theorem makes sense. Suppose that
$f$ has a local minimum at $x = a$. When you are to the
immediate left of $x = a$, you must be going downhill, so the slope (if it exists)
is negative. When you are to the immediate right of $x = a$, you are going
uphill, so the slope is positive. If you are to get from a negative to a positive
slope, you would think that you have to go through 0. On the other hand,
if $f(x) = |x|$, then $f$ goes from a slope of $-1$ to a slope of $1$ without passing
through 0. This is because $f'(0)$ doesn’t exist (as we saw in Section 5.2.10
in Chapter 5). That’s OK, though: the point $x = 0$ is still a critical point,
because the derivative doesn’t exist there. It’s also a local minimum. (Can
you see why?) By the way, the above logic doesn’t constitute a proof of the
theorem; a real proof is in Section A.6.6 of Appendix A.


11.1.3 How to find global maxima and minima 



The Extreme Value Theorem really makes finding global extrema pretty easy,
since it narrows down where they can be. Here’s the idea: every global
extremum is also a local extremum. Local extrema can only occur at critical
points (or, for a closed interval, at the endpoints). So just find all the critical
points and the endpoints, and look at the corresponding function values. The
biggest one gives the global maximum, while the smallest gives the global
minimum! In gory detail, here’s how to find the global maximum and minimum
of the function $f$ with domain $[a, b]$:

1. Find $f'(x)$. Make a list of all the points in $(a, b)$ where $f'(x)$ does not
exist or $f'(x) = 0$. That is, make a list of all the critical points in the
interval $(a, b)$.

2. Add the endpoints $x = a$ and $x = b$ to the list.

3. For each of the points in the list, find the $y$-coordinates by substituting
into the equation $y = f(x)$.

4. Pick the highest $y$-coordinate and note all the values of $x$ from the list
corresponding to that $y$-coordinate. These are the global maxima.

5. Do the same for the lowest $y$-coordinate to find the global minima.


We’ll worry about local extrema in Section 11.5 below. For now, let’s look at
an example of how to apply this method. Suppose that

$$f(x) = 12x^5 + 15x^4 - 40x^3 + 1$$

on the domain $[-1, 2]$. What are the global maxima and minima of $f$ on this
domain?

Let’s follow the above program. For step 1, we need to find $f'(x)$. No
problem: you should check that $f'(x) = 60x^4 + 60x^3 - 120x^2$. Clearly $f'(x)$
exists for all $x$ in $(-1, 2)$, so we just need to find all the values of $x$ satisfying
$f'(x) = 0$. That’s not so bad if you factor $f'(x)$ as $f'(x) = 60x^2(x - 1)(x + 2)$.
So we can see that if $f'(x) = 0$, we must have $x = 0$, $x = 1$ or $x = -2$. The
last of these is irrelevant since $-2$ is not in the interval $(-1, 2)$. So our list
just contains $x = 0$ and $x = 1$. Step 2 tells us to add the endpoints $x = -1$
and $x = 2$ to the list.

So, we arrive at step 3 armed with the following list of candidates for global
maxima and minima: $-1$, $0$, $1$, and $2$. We need to find the corresponding
function values. This is just a matter of plugging them in and calculating
that $f(-1) = 44$, $f(0) = 1$, $f(1) = -12$, and $f(2) = 305$. As for the last two
steps, all we have to do is select the highest and lowest values from this list.
The highest is 305, which occurs when $x = 2$, so $x = 2$ is a global maximum
for $f$. The lowest function value is $-12$, which occurs when $x = 1$, so $x = 1$
is a global minimum for $f$, and we’re all done!
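
The five-step method translates almost line for line into code. Here’s a minimal Python sketch for this particular example; the roots of $f'$ are typed in by hand from the factorization rather than solved for, and the variable names are assumptions made for illustration.

```python
def f(x):
    return 12 * x**5 + 15 * x**4 - 40 * x**3 + 1

def f_prime(x):
    return 60 * x**4 + 60 * x**3 - 120 * x**2

a, b = -1, 2

# Step 1: critical points inside (a, b). Here f' exists everywhere, and its
# roots come from the factorization 60 x^2 (x - 1)(x + 2).
roots_of_f_prime = [0, 1, -2]
critical_points = [c for c in roots_of_f_prime if a < c < b]

# Step 2: add the endpoints to the list.
candidates = sorted(critical_points + [a, b])

# Step 3: evaluate f at every candidate.
values = {x: f(x) for x in candidates}

# Steps 4 and 5: pick out the highest and lowest function values.
highest = max(values.values())
lowest = min(values.values())
global_maxima = [x for x in candidates if values[x] == highest]
global_minima = [x for x in candidates if values[x] == lowest]

print(values)
print(global_maxima, global_minima)
```

Collecting every $x$ that achieves the highest (or lowest) value, rather than just the first one found, matters when the extreme value occurs at more than one point.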

Before we start lounging around after our efforts, let’s take a closer look 
at the function f. First, note that if we made the domain larger, the situation 
could change for two reasons: the new endpoints would be different, and also 
the critical point at x = -2 could come into play. Second, we should look at 
what happens at the critical point x = 0 a little more closely. Is this a local 
maximum, a local minimum, or neither? One way to tell is to inspect the 
graph, which must look something like this: 



The point (-1, 44) is higher than (0, 1), which is in turn higher than (1, -12). 
So we can’t possibly have a local maximum or a local minimum at 0. But 
wait, you say, perhaps the graph looks something like this: 



In this picture, x = 0 is a local maximum. The problem is that we’ve had 
to introduce another local minimum somewhere between -1 and 0. After all, 
if the curve is supposed to get from (-1, 44) to (0, 1) while still being on a 
plateau at (0, 1), it’s got to go down below a height of 1. This means there 
has to be a valley as well, which means a local minimum somewhere between 
x = -1 and x = 0! That can’t happen, though, since there are no critical 
points between x = -1 and x = 0. So the graph must look more like the first 
picture above, and the conclusion is that x = 0 is neither a local maximum 
nor a local minimum. 

If the domain isn’t bounded, then the situation is a little more complicated. 
For example, consider the two functions f and g, both with domain [0, ∞), 
whose graphs look like this: 



[Figure: graphs of y = f(x) and y = g(x) on [0, ∞); each curve passes through the labeled point (2, 3).]






In both cases, x = 2 is obviously a critical point, while the endpoints are 0 
and ∞. Wait a second, ∞ isn’t really an endpoint, since it doesn’t really 
exist! Let’s add it to the list anyway, so that the list is 0, 2, and ∞; note that 
the same list works for both f and g. 

Let’s take a look at f first. We see that f(0) = 0, f(2) = 3, while f(∞) 
only makes sense if you think of it as 

\[ \lim_{x \to \infty} f(x). \]

This limit is 1, since y = 1 is a horizontal asymptote for f. The highest 
of these function values is 3, which occurs at x = 2, so x = 2 is a global 
maximum for f. The lowest function value is at x = 0, so x = 0 is a global 
minimum for f. The right-hand “endpoint” at ∞ doesn’t even come into it. 

How about g? Well, this time g(0) = 2, g(2) = 3, and the right-hand 
endpoint is covered by the observation that 

\[ \lim_{x \to \infty} g(x) = 1. \]

The highest value is still 3, which occurs at x = 2, so x = 2 is also a global 
maximum for g. How about the lowest value? Well, that value, which is 
1, occurs as x → ∞. Does this mean that ∞ is a global minimum for g? 
Of course not, because ∞ isn’t even a number; the function g has no global 
minimum.* 

11.2 Rolle’s Theorem 

Imagine you’re driving down a long straight highway. I watch you stop at a 
gas station. Then you proceed, always facing the same direction, although you 
can put the car in reverse if you want. Later on, I see you at the gas station 
again, without watching what you did in the meantime. I make the following 
conclusion: at some point when I wasn’t looking, your car had velocity equal 
to zero. 


*On the other hand, g does have a global infimum. This concept is a little beyond our 
scope, though. Check out a book on real analysis if you want to learn more. 

How can I be so confident about this? Well, it’s possible that you never 
even left the gas station, in which case your velocity was zero the whole 
time. If you did leave the gas station and went forward, well, you must 
eventually have gone backward or else you wouldn’t be back at the gas station 
again. So what happened when you ceased going forward and started going 
backward? You must have stopped, even for an instant! You can’t just change 
from going forward to backward without coming to rest. It’s similar to the 
situation we saw in Section 6.4.1 of Chapter 6 when we studied the motion of 
a ball being thrown up in the air. At the instant the ball reaches the top of 
its path, its velocity is 0. 

On the other hand, you might actually have started backing up from the 
gas station. In that case, you would have switched at some time from backward 
to forward motion, and the effect would be the same: you still stopped 
somewhere. Regardless of which way you set out, you might have stopped many 
times; but I know you stopped at least once. This is the content of Rolle’s 
Theorem,* which says: 

Rolle’s Theorem: suppose that f is continuous on [a, b] 
and differentiable on (a, b). If f(a) = f(b), then there must 
be at least one number c in (a, b) such that f'(c) = 0. 

In terms of your journey, we are supposing that f(t) is the position of your car 
at time t. This means that f'(t) is your velocity at time t. The times a and b 
are when I observed you at the gas station; the equation f(a) = f(b) means 
that you were in the same place at time a as at time b, which of course was the 
gas station. Finally, the number c is a time that you stopped, since f'(c) = 0. 
Rolle’s Theorem is telling me that you must have stopped at least once. I 
don’t know when, because I wasn’t watching, but I know it happened. (I am 
assuming that your car’s motion is differentiable, which is pretty reasonable 
in most circumstances. On the other hand, if you consider the point of view 
of a crash test dummy, perhaps the car’s motion isn’t differentiable at the 
moment the car hits the wall....) 
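
To see the theorem in action, here’s a small Python sketch. The position function f(t) = 4t - t^2 on [0, 4] is a made-up journey (it isn’t from the discussion above), chosen so that f(0) = f(4). The helper bisect is also just an illustration: it hunts for a zero of the velocity by repeatedly halving an interval, which works here because this particular velocity is positive at t = 0 and negative at t = 4. Rolle’s Theorem itself only promises that some c exists; it doesn’t tell you how to find it.

```python
def position(t):
    # Hypothetical position of the car at time t (an assumed example).
    return 4 * t - t**2

def velocity(t):
    # Derivative of the position function above.
    return 4 - 2 * t

def bisect(g, lo, hi, tol=1e-12):
    """Locate a zero of g in [lo, hi], assuming g changes sign on the interval."""
    g_lo, g_hi = g(lo), g(hi)
    if g_lo == 0:
        return lo
    if g_hi == 0:
        return hi
    if g_lo * g_hi > 0:
        raise ValueError("g must have opposite signs at the endpoints")
    while hi - lo > tol:
        mid = (lo + hi) / 2
        g_mid = g(mid)
        if g_mid == 0:
            return mid
        if g_lo * g_mid < 0:
            hi = mid            # the sign change is in the left half
        else:
            lo, g_lo = mid, g_mid  # the sign change is in the right half
    return (lo + hi) / 2

a, b = 0, 4
same_place = position(a) == position(b)  # both observations at the gas station
c = bisect(velocity, a, b)
print("same place at both times:", same_place)
print("the car is at rest at t =", c)
```

For this journey you can also solve 4 - 2t = 0 by hand to get t = 2, which is indeed strictly between 0 and 4.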

Now, let’s look at some pictures of a few possibilities of functions where 
Rolle’s Theorem applies: 



In the first two diagrams, there is only one possible value of c such that 
f'(c) = 0. In the third diagram, there are three potential candidates for c, 
but that’s OK: Rolle’s Theorem says that there must be at least one. The 
fourth diagram shows a constant function, so its derivative is always 0. This 
means that c could be any number between a and b. Now, let’s look at some 
pictures where Rolle’s Theorem does not apply: 


*See Section A.6.7 of Appendix A for a proof of Rolle’s Theorem. 


In all three cases, the derivative is never 0. That’s OK, because Rolle’s 
Theorem doesn’t apply in any of these cases. In the first picture, the function isn’t 
differentiable on all of (a, b) because of that spike at s. Yes, even one point 
where the function isn’t differentiable is enough to screw everything up. In 
the middle picture, the function is differentiable, but f(a) ≠ f(b), so Rolle’s 
Theorem cannot be used. In the right-hand picture, f(a) = f(b) and the 
function is differentiable on (a, b), but it isn’t continuous on all of [a, b]: the 
point x = a spoils everything. Once again, no Rolle’s Theorem allowed. 

Here’s an example of an application of Rolle’s Theorem. Suppose that 
you have a function f satisfying f'(x) > 0 for all x. In Section 10.1.1 in 
the previous chapter, we claimed that f must satisfy the horizontal line test. 
Let’s prove this using Rolle’s Theorem, arguing by contradiction. Start off 
by supposing that f does not satisfy the horizontal line test. Then there’s 
some horizontal line, say y = L, which intersects the graph of y = f(x) twice 
(or more). Suppose that two of these intersection points have x-coordinates 
a and b. So we know that f(a) = L and f(b) = L. In particular, f(a) = f(b), 
and we can use Rolle’s Theorem (we already know that f is differentiable 
everywhere, so it must be continuous everywhere as well). The theorem says 
that there is some c between a and b such that f'(c) = 0. This is impossible 
because f'(x) is always supposed to be positive! So the horizontal line test 
does not fail. 

Now, let’s look at an even harder example. Suppose now that the second 
derivative of f exists everywhere and that f''(x) > 0 for all real x. The 
problem is to show that f has at most two x-intercepts. Before we tackle the 
problem itself, let’s just think about what it means for a second or two. Can 
you think of a function f with f''(x) > 0 for all x that has no x-intercepts? 
How about one x-intercept? Two x-intercepts? If you can do all these, then 
try to find one with three x-intercepts. Don’t spend too long on this one, 
though, because it’s impossible! Indeed, our problem is to show that you can’t 
have more than two x-intercepts. 

In fact, here’s the key idea: if there are more than two x-intercepts, then 
there must be at least three! Let’s suppose that there are more than two; call 
any three of them you like a, b, and c, where we choose the variables so that 
a < b < c. Since they are all x-intercepts, we have f(a) = f(b) = f(c) = 0. 
So, start off by applying Rolle’s Theorem to the interval [a, b]. Since f is 
continuous and differentiable everywhere, and f(a) = f(b), we know that 
f'(p) = 0 for some p in the interval (a, b). Why do I use p? Because c is 
already taken! 

Now let’s move on to the interval [b, c]. Again, since f(b) = f(c), we 
can use Rolle’s Theorem to show that there must be some number q in (b, c) 
such that f'(q) = 0. Don’t forget that we also have f'(p) = 0. Hey, now 
we can use Rolle’s Theorem on the interval [p, q], but instead of taking the 
function as f, we’ll use f'. After all, we know that f'(p) = f'(q), since both 
of these quantities are 0. So by Rolle’s Theorem, we have some point r where 
(f')'(r) = 0. Wait a second, (f')' is just the second derivative f''. So we know 
that f''(r) = 0 for some r between p and q. This is a big problem because 
we had supposed that f''(x) > 0 for all x. The only way out is that our idea 
that there are more than two x-intercepts is all out of whack. There can’t be 
more than two, and we’ve solved the problem. 

Tricky stuff. By the way, did you find some functions satisfying f''(x) > 0 
for all x which have 0, 1 and 2 x-intercepts? If not, check out f(x) = x^2 + C, 
where C is positive, zero, or negative, respectively. 
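
Here’s a quick way to check those three cases for yourself. This is only a sketch in Python, and the function name x_intercepts is invented for the illustration. Since f(x) = x^2 + C has f''(x) = 2 > 0 everywhere, it fits the hypothesis, and its x-intercepts are just the real solutions of x^2 = -C.

```python
import math

def x_intercepts(C):
    """Real solutions of x^2 + C = 0, listed in increasing order."""
    if C > 0:
        return []            # x^2 + C is always positive: no intercepts
    if C == 0:
        return [0.0]         # the parabola just touches the x-axis
    r = math.sqrt(-C)
    return [-r, r]           # two intercepts, symmetric about 0

for C in (1, 0, -4):
    print("C =", C, "gives x-intercepts", x_intercepts(C))
```

No choice of C ever produces three intercepts, which is consistent with what we just proved.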

11.3 The Mean Value Theorem 

Suppose you go on another journey, and I find out that you have traveled 100 
miles in 2 hours. Your average velocity was 50 miles per hour. This doesn’t 
mean that you were going at exactly 50 miles per hour the whole time. Now, 
here’s my question: were you ever going at 50 miles per hour, even for an 
instant? 

The answer is yes. Even if you go at 45 mph for the first hour and 55 
mph for the second hour, you still have to accelerate from the slow velocity 
to the fast velocity. Along the way, your velocity will pass through 50 mph 
for an instant. You can’t avoid it! No matter how you do your journey, if 
your average velocity is 50 mph, then your instantaneous velocity must be 50 
mph at least once.* Of course, you might be going at 50 mph more than just 
once — there might be several times, or you can even go at 50 mph the whole 
time. This leads to the Mean Value Theorem, which says: 


The Mean Value Theorem: suppose that f is continuous 
on [a, b] and differentiable on (a, b). Then there’s at least one 
number c in (a, b) such that 

\[ f'(c) = \frac{f(b) - f(a)}{b - a}. \]


It seems a little weird, but it actually makes sense. You see, if f(t) is your 
position at time t, and you start and finish at times a and b, respectively, then 
what is your average velocity? The displacement is f(b) - f(a), while the time 
taken is b - a, so the quantity on the right-hand side of the above equation 
is just your average velocity. On the other hand, f'(c) is your instantaneous 
velocity at time c. The Mean Value Theorem says that there is at least one 
time c where your instantaneous velocity equals your average velocity over 
the whole journey. 
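
Here’s a numerical illustration, which is only a sketch. The position function f(t) = t^3 on the interval [0, 2] is an assumed example, not one from the text, and bisect is a hypothetical name for a standard interval-halving search. The average velocity is (f(2) - f(0))/(2 - 0) = 4, and we look for a time c when the instantaneous velocity 3c^2 equals that.

```python
import math

def position(t):
    # Assumed position function for the illustration.
    return t**3

def velocity(t):
    # Derivative of the position function above.
    return 3 * t**2

def bisect(g, lo, hi, tol=1e-12):
    """Locate a zero of g in [lo, hi], assuming g changes sign on the interval."""
    g_lo, g_hi = g(lo), g(hi)
    if g_lo == 0:
        return lo
    if g_hi == 0:
        return hi
    if g_lo * g_hi > 0:
        raise ValueError("g must have opposite signs at the endpoints")
    while hi - lo > tol:
        mid = (lo + hi) / 2
        g_mid = g(mid)
        if g_mid == 0:
            return mid
        if g_lo * g_mid < 0:
            hi = mid
        else:
            lo, g_lo = mid, g_mid
    return (lo + hi) / 2

a, b = 0.0, 2.0
average_velocity = (position(b) - position(a)) / (b - a)
# Find a time when instantaneous velocity minus average velocity is zero.
c = bisect(lambda t: velocity(t) - average_velocity, a, b)

print("average velocity:", average_velocity)
print("instantaneous velocity equals it at c =", c)
print("by hand, c = 2/sqrt(3) =", 2 / math.sqrt(3))
```

Solving 3c^2 = 4 by hand gives c = 2/sqrt(3), which does lie in (0, 2), so the search and the algebra agree.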

Let’s look at a picture of the situation. Suppose your function f looks like 
this: 


* Again, all this assumes, very reasonably, that your car’s motion is differentiable! 
[Figure: graph of y = f(x) between x = a and x = b, with a dashed line joining the endpoints and tangent lines drawn at x = c_0 and x = c_1.]


The dashed line joining (a, f(a)) and (b, f(b)) has slope 

\[ \frac{f(b) - f(a)}{b - a}. \]



According to the Mean Value Theorem, there is some tangent whose slope 
equals this quantity; that is, some tangent is parallel to the dashed line. In the 
above picture, there are actually two tangents that work: the x-coordinates 
are at c_0 and c_1. Either one would be an acceptable candidate for the number 
c in the theorem. 

The Mean Value Theorem looks a lot like Rolle’s Theorem. In fact, the 
conditions for applying the two theorems are almost the same. In both cases, 
your function f has to be continuous on a closed interval [a, b] and 
differentiable on (a, b). Rolle’s Theorem also requires that f(a) = f(b), but the Mean 
Value Theorem doesn’t require that. In fact, if you apply the Mean Value 
Theorem to a function f satisfying f(a) = f(b), you’ll see that f(b) - f(a) = 0, so 
you get a number c in (a, b) satisfying f'(c) = 0. So the Mean Value Theorem 
reduces to Rolle’s Theorem! 

Now let’s look at a couple of examples of how to use the theorem. First, 
how would you show that the equation 

\[ 2x e^{x^2} - e + 1 = 0 \]

has a solution? One way is to use the Intermediate Value Theorem (see 
Section 5.1.4 in Chapter 5); try it now and see. Suppose instead that I 
give you a nudge by suggesting that you apply the Mean Value Theorem to 
f(x) = e^{x^2} on the domain [0, 1]. That’s acceptable because f is continuous 
and differentiable everywhere. The theorem says that there’s a number c in 
(0, 1) such that 

\[ f'(c) = \frac{f(1) - f(0)}{1 - 0}. \]

Clearly, we’ll need to find f'(x). Using the chain rule, you should be able to 
show that f'(x) = 2x e^{x^2}. So the above equation becomes 

\[ 2c e^{c^2} = \frac{e^{1^2} - e^{0^2}}{1 - 0} = e - 1. \]

So we have 2c e^{c^2} - e + 1 = 0, and we have shown that our original equation 
above does have a solution. In fact, we’ve shown that there’s a solution 
between 0 and 1. 
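
The argument shows that a solution exists without telling you what it is. If you’re curious, here’s a rough Python sketch that pins it down numerically. The function name g and the bisect helper are labels made up for this illustration; the search relies on the fact that g(0) = 1 - e is negative and g(1) = e + 1 is positive, which is really the Intermediate Value Theorem approach rather than the Mean Value Theorem one.

```python
import math

def g(x):
    # Left-hand side of the equation 2x e^(x^2) - e + 1 = 0.
    return 2 * x * math.exp(x**2) - math.e + 1

def bisect(g, lo, hi, tol=1e-12):
    """Locate a zero of g in [lo, hi], assuming g changes sign on the interval."""
    g_lo, g_hi = g(lo), g(hi)
    if g_lo == 0:
        return lo
    if g_hi == 0:
        return hi
    if g_lo * g_hi > 0:
        raise ValueError("g must have opposite signs at the endpoints")
    while hi - lo > tol:
        mid = (lo + hi) / 2
        g_mid = g(mid)
        if g_mid == 0:
            return mid
        if g_lo * g_mid < 0:
            hi = mid
        else:
            lo, g_lo = mid, g_mid
    return (lo + hi) / 2

c = bisect(g, 0.0, 1.0)
print("solution c =", c)
# The Mean Value Theorem says the slope of e^(x^2) at c equals e - 1.
print("2c e^(c^2) =", 2 * c * math.exp(c**2), "and e - 1 =", math.e - 1)
```

The two numbers on the last line should agree to many decimal places, which is exactly the statement f'(c) = e - 1.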

Now, since we have assumed that f' is always equal to 0, the quantity f'(c) 
must be 0. So the above equation says that 

\[ \frac{f(x) - f(S)}{x - S} = 0, \]

which means that f(x) = f(S). If we now let C = f(S), we have 
shown that f(x) = C for all x in the interval (a, b), so f is constant! In 
summary, 

if f'(x) = 0 for all x in (a, b), then f is constant on (a, b). 


Actually, we’ve already used this fact in Section 10.2.2 of the previous 
chapter. There we saw that if f(x) = sin^{-1}(x) + cos^{-1}(x), then f'(x) = 0 
for all x in the interval (-1, 1). We concluded that f is constant on that 
interval, and in fact since f(0) = π/2, we have sin^{-1}(x) + cos^{-1}(x) = π/2 
for all x in (-1, 1). 
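
A numerical spot-check is no substitute for the argument, but it’s reassuring. Here’s a minimal Python sketch; the sample points are arbitrary choices in (-1, 1), and math.asin and math.acos are Python’s names for the inverse sine and inverse cosine.

```python
import math

def f(x):
    # Inverse sine plus inverse cosine.
    return math.asin(x) + math.acos(x)

samples = [-0.9, -0.5, 0.0, 0.3, 0.75]
for x in samples:
    # The difference from pi/2 should be zero up to rounding error.
    print(x, f(x) - math.pi / 2)
```

Each printed difference should be zero or within rounding error of it, as you’d expect from a constant function.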

2. Suppose that two differentiable functions have exactly the same 
derivative. Are they the same function? Not necessarily. They could differ 
by a constant; for example, f(x) = x^2 and g(x) = x^2 + 1 have the 
same derivative, 2x, but f and g are clearly not the same function. Is 
there any other way that two functions could have the same derivative 
everywhere? The answer is no. Differing by a constant is the only way: 


if f'(x) = g'(x) for all x, then f(x) = g(x) + C for some constant C. 


It turns out to be quite easy to show this using #1 above. Suppose 
that f'(x) = g'(x) for all x. Now set h(x) = f(x) - g(x). Then we 
can differentiate to get h'(x) = f'(x) - g'(x) = 0 for all x, so h is 
constant. That is, h(x) = C for some constant C. This means that 
f(x) - g(x) = C, or f(x) = g(x) + C. The functions f and g do indeed 
differ by a constant. This fact will be very useful when we look at 
integration in a few chapters’ time. 

3. If a function f has a derivative that’s always positive, then it must be 
increasing. This means that if a < b, then f(a) < f(b). In other words, 
take two points on the curve; the one on the left is lower than the one 
on the right. The curve is getting higher as you look from left to right. 
Why is it so? Well, suppose f'(x) > 0 for all x, and also suppose that 
a < b. By the Mean Value Theorem, there’s a c in the interval (a, b) 
such that 

\[ f'(c) = \frac{f(b) - f(a)}{b - a}. \]

This means that f(b) - f(a) = f'(c)(b - a). Now f'(c) > 0, and b - a > 0 
since b > a, so the right-hand side of this equation is positive. So we 
have f(b) - f(a) > 0, hence f(b) > f(a), and the function is indeed 
increasing. On the other hand, if f'(x) < 0 for all x, the function is 
always decreasing; this means that if a < b then f(b) < f(a). The proof 
is basically the same. 

11.4 The Second Derivative and Graphs 

So far, we haven’t paid much attention to the second derivative. We’ve only 
used it to define acceleration, and that’s about all. Actually, the second 
derivative can tell you a lot about what the graph of your function looks 
like. For example, suppose that you know that f''(x) > 0 for all x in some 
interval (a, b). If you think of the second derivative f'' as the derivative of the 
derivative, then you can write (f')'(x) > 0. This means that the derivative 
f'(x) is always increasing. 

So what? Well, if you know that the derivative is increasing, this means 
that it’s getting more and more difficult to “climb up” the function. The 
situation could look like this: 


[Figure: a concave-up curve over the interval from x = a to x = b, with a flat part at x = c.]


Just to the right of x = a, the mountain-climber has it nice and easy: the 
slope is negative. It’s getting harder all the time, though; first it gets flatter, 
until the climber reaches the flat part at x = c; then the going keeps on getting 
tougher as the slope increases up to x = b. The important thing is that the 
slope is increasing all the way from x = a up to x = b. This is exactly what 
is implied by the inequality f''(x) > 0. 

We need a way to describe this sort of behavior. We’ll say a function is 
concave up on an interval (a, b) if its slope is always increasing on that 
interval, or equivalently if its second derivative is always positive on the interval 
(assuming that the second derivative exists). Here are some other examples 
of graphs of functions which are concave up on their whole domains: 



They all look like part of a bowl. Notice that you can’t tell anything about 
the sign of the first derivative f'(x) just by knowing that f''(x) > 0. Indeed, 
the middle two graphs have negative first derivative; the rightmost graph has 
positive first derivative; while the leftmost graph has a first derivative that is 
negative and then positive. 

If instead the second derivative f''(x) is negative, then everything is 
reversed. You end up with something more like an upside-down bowl, and we say 
that f is concave down on any interval where its second derivative is always 
negative.* Here are some examples of functions which are concave down on 
their entire domain: 



In this case, the derivative is always decreasing: it’s getting easier and easier 
to climb as you go along in each case. If you’re going uphill, this means it’s 
getting less and less steep, but if you’re going downhill, it’s getting steeper 
and steeper downhill (as you go from left to right). 

Of course, the concavity doesn’t have to be the same everywhere: it can 
change: 



To the left of x = c, the curve is concave down, while to the right of x = c, 
the curve is concave up. We’ll say that the point x = c is a point of inflection 
for f because the concavity changes as you go from left to right through c. 

11.4.1 More about points of inflection 

In the above picture, we see that f''(x) < 0 to the left of c and f''(x) > 0 
to the right of c. What about f''(c) itself? It must be 0, since everything 
is nice and smooth. In general, if c is a point of inflection, then the sign of 
f''(x) must be different on either side of x = c, assuming of course that f''(x) 
actually exists when x is near c. In that case, it must be true that 

if x = c is a point of inflection for f, then f''(c) = 0. 


On the other hand, if f''(c) = 0, then c may or may not be an inflection point! 
That is, 


if f''(c) = 0, then it’s not always true that x = c is a 
point of inflection for f. 


*If you have trouble remembering which one is concave up and which is concave down, 
the following rhyme might help: “like a cup, concave up; like a frown, concave down.” 









For example, suppose that f(x) = x^4. Then f'(x) = 4x^3 and f''(x) = 12x^2. 
At x = 0, the second derivative vanishes, because f''(0) = 12(0)^2 = 0. So is 
x = 0 a point of inflection? The answer is no. Here’s a miniature graph of 
y = x^4: 



You can see that f is always concave up; so the concavity doesn’t change 
around x = 0. That is, x = 0 is not a point of inflection, despite the fact that 
f''(0) = 0. 

On the other hand, if you want to find points of inflection, you do need 
to find where the second derivative vanishes. That at least narrows down the 
list of potential candidates, which you can check one by one. For example, 
suppose that f(x) = sin(x). We have f'(x) = cos(x) and f''(x) = -sin(x). 
The second derivative -sin(x) vanishes whenever x is a multiple of π. Let’s 
focus on what happens at x = 0. We have f''(0) = -sin(0) = 0. Is x = 0 an 
inflection point? Let’s take a look at the graph: 



Yes, x = 0 is a point of inflection: sin(x) is concave up immediately to the left 
of 0 but concave down to the right of 0. Notice that the tangent line at x = 0 
passes through the curve y = sin(x). This is typical of points of inflection: 
the curve must be above the tangent line on one side and below the tangent 
line on the other side. 
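
The “check the candidates one by one” step can be sketched in Python as follows. The function name changes_concavity and the step size h are assumptions made for this illustration. Sampling f'' at just one point on each side of c is only a heuristic: it’s fine for the two examples here, but it could be fooled by a function whose second derivative wiggles very close to c.

```python
import math

def changes_concavity(second_derivative, c, h=1e-3):
    """Return True if f'' has opposite signs at c - h and c + h."""
    left = second_derivative(c - h)
    right = second_derivative(c + h)
    return left * right < 0

# f(x) = x^4 has f''(x) = 12x^2, which vanishes at 0.
quartic = changes_concavity(lambda x: 12 * x**2, 0.0)
# f(x) = sin(x) has f''(x) = -sin(x), which also vanishes at 0.
sine = changes_concavity(lambda x: -math.sin(x), 0.0)

print("x^4 changes concavity at 0:", quartic)
print("sin(x) changes concavity at 0:", sine)
```

Both candidates have second derivative 0 at x = 0, but only sin(x) passes the check, matching the two graphs.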


11.5 Classifying Points Where the Derivative Vanishes 

It’s time to apply some of the above theory to a practical problem. Suppose 
that you have a function f and a number c such that f'(c) = 0. You can 
say for sure that c is a critical point for f, but what else can you say? It 
turns out that there are only three common possibilities: x = c could be a 
local maximum; it could be a local minimum; or it could be a horizontal point 
of inflection, which means that it is a point of inflection with a horizontal 
tangent line.* (It’s also possible that f(x) is constant for all x near c, but in 
that case c is both a local maximum and a local minimum.) In any case, here 
are some pictures of the common possibilities: 


* Another possibility is that the concavity isn’t even well-defined near the critical point. 
For example, if f(x) = x^4 sin(1/x), then the sign of f''(x) oscillates wildly as x approaches 
the critical point 0 from either above or below, so the concavity keeps switching between 
up and down! 
Local maximum Local minimum Horizontal point of inflection 


In each case, the tangent line is horizontal; that’s all you can tell if you only 
know that f'(c) = 0. How do you tell which case applies? There are two 
methods, one involving only the first derivative, and the other involving the 
second derivative. When you use the first derivative, you have to look at the 
sign (positive or negative) of the first derivative near x = c. On the other 
hand, if you use the second derivative, then you need to consider its sign at 
x = c. Let’s look at these methods one at a time. 

11.5.1 Using the first derivative 

Let’s take another look at the above cases, but this time we’ll draw in some 
tangent lines near x = c: 



In the first case, we have a local maximum at x = c. To the left of c, the 
slope is positive. This means that the function is increasing in that portion 
of the domain (as we saw in Section 11.3.1 above). On the other hand, to the 
right of c, the slope is negative: the function is decreasing there. It’s clear 
that whenever the slope changes from positive to negative as you move from 
left to right, the point where the slope is 0 must be a local maximum. 

In the second case, the situation is reversed. If the slope changes from 
negative to positive as you go from left to right, the point where the slope is 
0 must be a local minimum. In the third case, the slope is always positive 
(except at x = c), while in the fourth case, the slope is always negative (except 
at x = c). Both cases give a point of inflection: the derivative doesn’t change 
sign. 


Here’s a summary of what we have just observed. Suppose that f'(c) = 0. 
Then: 

• if f'(x) changes sign from positive to negative as you pass from left to 
right through x = c, then x = c is a local maximum; 

• if f'(x) changes sign from negative to positive as you pass from left to 
right through x = c, then x = c is a local minimum; 

• if f'(x) doesn’t change sign as you pass through x = c from left to right, 
then x = c is a horizontal point of inflection. 

For example, if f(x) = x^3, then we have f'(x) = 3x^2. This is 0 when x = 0, 
so x = 0 must be a local maximum, local minimum, or horizontal point of 
inflection. Which is it? Well, f'(x) is always positive when x ≠ 0, so the 
derivative doesn’t change sign as you pass through x = 0 from left to right. 
So x = 0 must be a point of inflection. Draw the graph and check that this 
makes sense! (You can also find the graph in Section 11.5.2 below.) 

Here’s another example. If we now set f(x) = x ln(x), then where are the 
local maxima, minima, and horizontal points of inflection of f? Well, you 
should use the product rule to find that f'(x) = ln(x) + 1. (Check that you 
believe this!) We are looking for solutions to the equation f'(x) = 0, which 
means that ln(x) + 1 = 0. Rearranging, we get ln(x) = -1; now exponentiate 
both sides to get x = e^{-1}, otherwise known as 1/e. This is the only potential 
candidate. But what sort of critical point is it? 

Well, let’s look at the sign of f'(x) = ln(x) + 1 when x is near 1/e. The 
easiest way to do this is to draw a quick graph of y = f'(x). All we have to 
do is take our graph for ln(x) and shift it up by 1. Here’s what we get: 


[Figure: graph of y = f'(x)]


You can see from the graph that f'(x) goes from negative to positive as we 
pass through 1/e. So x = 1/e must be a local minimum for f. Now, what is 
the value of f(1/e)? We can plug it in and get f(1/e) = (1/e) ln(1/e) = -1/e, 
noting that ln(1/e) = ln(e^{-1}) = -ln(e) = -1. So the graph of y = f(x) has 
a local minimum at the point (1/e, -1/e). It must look something like this: 


[Figure: graph of y = f(x) = x ln(x)]
As you can see, we don’t know much about the graph yet! We’ll finish it off 
in Section 12.3.2 of the next chapter. 
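
The first derivative test is mechanical enough to sketch in Python. This is a rough illustration under stated assumptions: the name first_derivative_test and the step size h are invented here, and looking at f' at only one point on each side of c assumes that the sign of f' doesn’t flip again within a distance h of c.

```python
import math

def first_derivative_test(f_prime, c, h=1e-3):
    """Classify a point c with f'(c) = 0 by the sign of f' at c - h and c + h."""
    left = f_prime(c - h)
    right = f_prime(c + h)
    if left > 0 and right < 0:
        return "local maximum"
    if left < 0 and right > 0:
        return "local minimum"
    if left * right > 0:
        return "horizontal point of inflection"
    return "inconclusive"  # f' is zero at a sample point, so we can't tell

# f(x) = x^3, with f'(x) = 3x^2 and critical point 0.
cubic = first_derivative_test(lambda x: 3 * x**2, 0.0)
# f(x) = x ln(x), with f'(x) = ln(x) + 1 and critical point 1/e.
x_log_x = first_derivative_test(lambda x: math.log(x) + 1, 1 / math.e)

print("x^3 at 0:", cubic)
print("x ln(x) at 1/e:", x_log_x)
```

The two classifications agree with what we found by hand for x^3 and x ln(x).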

11.5.2 Using the second derivative 

Take another look at the common possibilities which arise when f'(c) = 0: 



Local maximum Local minimum Horizontal point of inflection 


Imagine that f''(c) > 0. We saw in Section 11.4 above that this means that 
the curve y = f(x) is concave up near x = c. The only one of the above 
four graphs which is concave up is the second one, that is, the case of a local 
minimum at x = c. Similarly, if f''(c) < 0, then the curve is concave down, 
and we must be in the first case above: c is a local maximum in that case. 

This is pretty useful, but there’s a catch: if f''(c) = 0, then you could 
be in any one of the four cases! For example, suppose that f(x) = x^3 and 
g(x) = x^4. We have f'(x) = 3x^2, so f'(0) = 0. Let’s find f''(0) to try to 
classify the critical point. Since f''(x) = 6x, we have f''(0) = 0. 

On the other hand, what about g? As we saw in Section 11.4.1 above, we 
have g'(x) = 4x^3, so g'(0) = 0. What sort of critical point is x = 0? Let’s 
check the second derivative: g''(x) = 12x^2, so g''(0) = 0. 

In both cases, at the critical point x = 0, the second derivative is 0. As 
you can see from the miniature graphs below, f has a point of inflection at 
x = 0 while g has a local minimum there: 



[Figure: miniature graphs of y = f(x) = x^3 and y = g(x) = x^4]


So much for using the second derivative to distinguish between these two 
cases. When the second derivative is 0, you are so in the dark, you might as 
well be in an underground room with your eyes closed and one of those really 
thick blindfolds on. You just can’t tell whether you’re dealing with a local 
maximum, a local minimum, or a horizontal point of inflection. So, here’s the 
summary of the situation. Suppose that f'(c) = 0. Then: 

• if f''(c) < 0, then x = c is a local maximum; 

• if f''(c) > 0, then x = c is a local minimum; 

• if f''(c) = 0, then you can’t tell what happens! Use the first 
derivative test from the previous section. 



Yes, the first derivative test is better, although it’s a little more cumbersome 
to use. It always works, while the second derivative test sometimes lets you 
down. Here’s an example where things do work out, though: suppose that 
f(x) = x ln(x). Hey, this is the same example as one from the previous section! 
There we saw using the first derivative test that 1/e is a local minimum for 
f. Let’s try using the second derivative test instead. 

First, recall that f'(x) = ln(x) + 1, so f'(1/e) = 0. We can easily see 
that f''(x) = 1/x. When x = 1/e, we have f''(1/e) = e, which is positive. 
So the concavity is upward at 1/e, which means that we’re dealing with the 
bowl shape; according to the above summary, 1/e is indeed a local 
minimum. 
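
For comparison, here’s the second derivative test as a short Python sketch. The function name second_derivative_test is made up for the illustration; it takes the number f''(c) as its input, on the assumption that you have already checked that f'(c) = 0.

```python
import math

def second_derivative_test(second_derivative_at_c):
    """Classify a point c with f'(c) = 0 from the value of f''(c) alone."""
    if second_derivative_at_c < 0:
        return "local maximum"
    if second_derivative_at_c > 0:
        return "local minimum"
    return "can't tell: use the first derivative test"

# f(x) = x ln(x): f''(x) = 1/x, evaluated at the critical point 1/e.
x_log_x = second_derivative_test(1 / (1 / math.e))
# f(x) = x^3: f''(x) = 6x, evaluated at the critical point 0.
cubic = second_derivative_test(6 * 0)
# g(x) = x^4: g''(x) = 12x^2, evaluated at the critical point 0.
quartic = second_derivative_test(12 * 0**2)

print("x ln(x) at 1/e:", x_log_x)
print("x^3 at 0:", cubic)
print("x^4 at 0:", quartic)
```

Notice that x^3 and x^4 get the same unhelpful answer even though one has a point of inflection and the other a local minimum; that’s the catch with this test.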

Sketching Graphs 


Now it’s time to look at a general method for sketching the graph of y = f(x) 
for some given function f. When we sketch a graph, we’re not looking for 
perfection; we just want to illustrate the main features of the graph. Indeed, 
we’re going to use the calculus tools we’ve developed: limits to understand 
the asymptotes, the first derivative to understand maxima and minima, and 
the second derivative to investigate the concavity. Here’s what we’ll look at: 

• the useful technique of making a table of signs; 

• a general method for sketching graphs; and 

• five examples of how to use the method. 


12.1 How to Construct a Table of Signs 


Suppose you want to sketch the graph of y = f(x). For any number x, the 
quantity f(x) could be positive, negative, zero, or undefined. Luckily, if f is 
continuous except for maybe a few points, and you can find all of the zeroes 
and discontinuities of f, then it’s easy to see where f(x) is positive and where 
it’s negative by using a table of signs. 

Here’s how it works: start off by making a list of all the zeroes and 
discontinuities of f in ascending order. For example, if 

\[ f(x) = \frac{(x - 3)(x - 1)^2}{x^3 (x + 2)}, \]

then the zeroes of f are 3 and 1, and f is discontinuous at 0 and -2. So our 
list, in order, is -2, 0, 1, 3. Now, draw a table with three rows and plenty of 
columns. We’ll label the first two rows x and f(x); the third row will actually 
be blank. Now, write the values in your list of zeroes and discontinuities 
across the top row so that there’s one space on either side of each number. In 
our example, the table would look like this: 

x    |    | -2 |    | 0  |    | 1  |    | 3  |    
f(x) |    |    |    |    |    |    |    |    |    
     |    |    |    |    |    |    |    |    |    


Now you can fill in some of the second row: just put a 0 where f(x) is 0 and 
a star where f is discontinuous: 

x    |    | -2 |    | 0  |    | 1  |    | 3  |    
f(x) |    | ★  |    | ★  |    | 0  |    | 0  |    
     |    |    |    |    |    |    |    |    |    


Next, pick your favorite number between each of the special numbers on the 
top, as well as one at the beginning and one at the end. In our example, you 
might pick -3 as being to the left of -2; and -1 as being between -2 and 0; 
and so on, until the table looks something like this: 

x    | -3 | -2 | -1 | 0  | 1/2 | 1  | 2  | 3  | 4  
f(x) |    | ★  |    | ★  |     | 0  |    | 0  |    
     |    |    |    |    |     |    |    |    |    


We could have chosen -4 instead of -3, or some other number between 0 and 1 
instead of 1/2; it wouldn’t have 
made any difference. We can pick any number between the special numbers. 
Now, the next thing is to find whether f(x) is positive or negative for each of 
the values we just chose. In our example, consider x = -3; then 

\[ f(-3) = \frac{(-3 - 3)(-3 - 1)^2}{(-3)^3 (-3 + 2)} = \frac{(-6)(16)}{(-27)(-1)} = -\frac{32}{9}. \]


So we can put a minus sign in the box under —3. Now we didn’t actually 
need to work that hard, since we could care less about the value of /(—3): 
we only care whether it’s positive or negative. We should just have looked 
at each factor to see whether it’s positive or negative. In particular, when 
x = —3, you can see that (x — 3) is negative, (x — l) 2 is positive (it can’t be 
negative since it’s a square!), x s is negative, and (x + 2) is negative as well. 
The overall effect is 

(-)(+) _ 

㈠㈠ _ ， 


so /(—3) is negative. Now try it for each of our other numbers, and verify 
that you can fill in the whole table like this: 


x     |  -3  |  -2  |  -1  |   0  |  1/2 |   1  |   2  |   3  |   4
f(x)  |   -  |   ★  |   +  |   ★  |   -  |   0  |   -  |   0  |   +
      |      |      |      |      |      |      |      |      |

The main point is not that f(-3) is negative, but that f(x) is negative for 
all x < -2. The number -3 is just a representative sample point for the 
region $(-\infty, -2)$. Whatever sign f(-3) is, f(x) has the same sign on the 
whole region. Similarly, since f(-1) is positive, f(x) is positive on the entire 
interval (-2, 0). Already this gives us lots of information about the graph of 
y = f(x), which we’ll look at in Section 12.3.1 below. 
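
If you’d like to double-check a table like this by machine, here is a minimal Python sketch of the procedure described above. The name sign_table and the choice of midpoints as sample points are assumptions made for illustration (any number in each gap would do), and exact fractions are used so that rounding can’t flip a sign.

```python
from fractions import Fraction


def f(x):
    """The example function f(x) = (x - 3)(x - 1)^2 / (x^3 (x + 2))."""
    return (x - 3) * (x - 1) ** 2 / (x ** 3 * (x + 2))


def sign_table(func, zeroes, discontinuities):
    """Return a list of (x, symbol) pairs for a table of signs.

    The special points are the zeroes and discontinuities. One sample
    point is chosen to the left of them all, one in each gap (the
    midpoint, a hypothetical choice), and one to the right.
    """
    special = sorted(set(zeroes) | set(discontinuities))
    samples = [special[0] - 1]
    samples += [(a + b) / 2 for a, b in zip(special, special[1:])]
    samples.append(special[-1] + 1)

    table = []
    for i, s in enumerate(samples):
        # The sign at the sample point stands for the whole interval.
        table.append((s, "+" if func(s) > 0 else "-"))
        if i < len(special):
            p = special[i]
            table.append((p, "*" if p in discontinuities else "0"))
    return table


zeroes = [Fraction(1), Fraction(3)]
discontinuities = [Fraction(-2), Fraction(0)]
table = sign_table(f, zeroes, discontinuities)
for x, symbol in table:
    print(f"{str(x):>5}  {symbol}")
```

With these inputs the sample points come out as -3, -1, 1/2, 2 and 4, the same ones chosen in the text, so the printed column of symbols can be compared directly with the second row of the table.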

Here’s another example. Suppose that 

\[ f(x) = x^2 (x - 5)^3. \]

We’ve actually already looked at this function f a little bit in Section 10.1.4 of 
Chapter 10. Let’s take a closer look, starting with a table of signs. The zeroes 
of f clearly occur at x = 0 and x = 5 only, and there are no discontinuities. 
So our special points are at 0 and 5. We need to fill in the gaps. To the left 
of 0, I’ll choose -1; in between I’ll choose 2; and to the right, I’ll choose 6. 
So our table of signs looks like this: 

x     |  -1  |   0  |   2  |   5  |   6
f(x)  |   -  |   0  |   -  |   0  |   +
      |      |      |      |      |

Here’s how I came up with the signs at -1, 2, and 6: 

• When x = -1, both x and $(x - 5)$ are negative. The sign of f(-1) is 
therefore $(-)^2 (-)^3 = (+)(-) = (-)$. 

• When x = 2, now x is positive and $(x - 5)$ is negative. The sign of f(2) 
is $(+)^2 (-)^3$, which is still $(-)$. 

• When x = 6, now both x and $(x - 5)$ are positive, so f(6) has sign 
$(+)^2 (+)^3 = (+)$. 

We’ll use this table to help us sketch the graph of y = f(x) in Section 12.3.3 
below. For now, let’s see how to make a table of signs for the derivative and 
the second derivative. 


12.1.1 Making a table of signs for the derivative 



As we saw in Section 11.3.1 of the previous chapter, the sign of the derivative 
of a function tells you a lot about the function. Whenever the derivative 
is positive, the function is increasing; when the derivative is negative, the 
function is decreasing; and when the derivative is 0, the function has a local 
maximum, a local minimum, or a horizontal point of inflection. A table of 
signs for the derivative can summarize all this information in a compact, 
simple way. 

The method is the same as for the table of signs for f(x) that we looked at 
above, except that now you apply it to f'(x) instead. The only other difference 
is that when f'(x) is zero, we’ll put a little flat line in the third row; when 
f'(x) is positive, the line will slope upward; and when f'(x) is negative, the 
line will slope downward. 

Let’s see how it works for our previous example where $f(x) = x^2 (x - 5)^3$. 
In Section 10.1.4 of Chapter 10, we calculated that $f'(x) = 5x(x - 5)^2 (x - 2)$. 
(Try it yourself if you don’t want to look back!) This means that f'(x) = 0 
when x = 0, x = 2 or x = 5. Let’s pick some points in between: we’ll choose 
-1 to the left of 0; between 0 and 2, we’ll pick 1; between 2 and 5, we’ll choose 
3; finally, we’ll select 6 to the right of 5. Our table of signs looks like this, so 
far: 

x      |  -1  |   0  |   1  |   2  |   3  |   5  |   6
f'(x)  |      |   0  |      |   0  |      |   0  |
       |      |      |      |      |      |      |

Now we need to find the sign of f'(x) at the new points we chose. For example, 
when x = -1, we see that 5x is negative, $(x - 5)$ is negative, and $(x - 2)$ is also 
negative, so f'(-1) has sign $(-)(-)^2 (-) = (+)$. I leave it to you to repeat 
this exercise with the other values and verify that the filled-in table looks like 
this: 

x      |  -1  |   0  |   1  |   2  |   3  |   5  |   6
f'(x)  |   +  |   0  |   -  |   0  |   +  |   0  |   +
       |   /  |   ─  |   \  |   ─  |   /  |   ─  |   /


Notice how I drew the little lines in the third row: upward-sloping when 
f'(x) has sign (+), downward when its sign is (-), and flat when its sign is 
0. We immediately know that f is increasing when x < 0 and when x > 2, 
while it’s decreasing for 0 < x < 2. The table also reveals that x = 0 is a 
local maximum, x = 2 is a local minimum, and x = 5 is a horizontal point 
of inflection. We’ll use the above table again when we sketch the graph of 
y = f(x) in Section 12.3.3 below. 

A word of warning: the lines in the third row of the table are meant only 
to guide you as you sketch the graph of y = f(x). The graph probably doesn’t 
look like a collection of lines tacked together! Instead, just use the information 
in that third row to understand where the graph is increasing, decreasing or 
temporarily flat. 
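
The way the table is read off can also be written down as a tiny Python sketch. The helper names sign and classify are hypothetical, and the rule in classify assumes that the derivative exists and is nonzero at the sample points on both sides of each critical point, as it is in this example.

```python
def fprime(x):
    """Derivative from the text: f'(x) = 5x(x - 5)^2 (x - 2)."""
    return 5 * x * (x - 5) ** 2 * (x - 2)


def sign(value):
    """Return '+', '-' or '0' according to the sign of value."""
    return "+" if value > 0 else "-" if value < 0 else "0"


def classify(left_sign, right_sign):
    """Classify a critical point from the signs of f' on either side."""
    if left_sign == "+" and right_sign == "-":
        return "local maximum"          # rising, then falling
    if left_sign == "-" and right_sign == "+":
        return "local minimum"          # falling, then rising
    return "horizontal point of inflection"  # same sign on both sides


critical_points = [0, 2, 5]
samples = [-1, 1, 3, 6]  # the sample points chosen in the text

signs = [sign(fprime(s)) for s in samples]
results = {
    c: classify(signs[i], signs[i + 1])
    for i, c in enumerate(critical_points)
}
for c, kind in results.items():
    print(f"x = {c}: {kind}")
```

Each critical point sits between two consecutive sample points, so comparing neighboring entries of signs is exactly the same as comparing the little lines on either side of a flat line in the third row.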


12.1.2 Making a table of signs for the second derivative 



We’ve also seen that the sign of the second derivative is important (check out 
Section 11.4 of the previous chapter). When the sign is positive, the curve is 
concave up; when the sign is negative, the curve is concave down; and when 
it’s 0, you may or may not get a point of inflection. The table of signs for the 
second derivative tells all. 

The method is the same as for the function or the derivative, except that 
the third row is now used to show whether the function is concave up or 
concave down. Put a little upward parabola-like curve whenever the sign is 
(+), a downward version when the sign is (-), and a dot when the sign is 0. 

If we return to our example $f(x) = x^2 (x - 5)^3$ from above, we have already 
seen that $f'(x) = 5x(x - 5)^2 (x - 2)$. To differentiate this, let’s combine the x 
and $(x - 2)$ factors to write $f'(x) = 5(x - 5)^2 (x^2 - 2x)$. Now we can use the 
product rule: 

\[ f''(x) = 5\left( (x^2 - 2x) \times (2(x - 5)) + (x - 5)^2 (2x - 2) \right). \]

Taking out a common factor of $(x - 5)$ and rearranging, we find that we have 
$f''(x) = 10(x - 5)(2x^2 - 8x + 5)$. Actually, you can use the quadratic formula 
to see that the solutions of $2x^2 - 8x + 5 = 0$ are $2 \pm \frac{1}{2}\sqrt{6}$. So we can completely 
factor f''(x) as 

\[ f''(x) = 20\left(x - \left(2 - \tfrac{1}{2}\sqrt{6}\right)\right)\left(x - \left(2 + \tfrac{1}{2}\sqrt{6}\right)\right)(x - 5). \]

What this means is that f''(x) has sign 0 at $x = 2 - \frac{1}{2}\sqrt{6}$, $x = 2 + \frac{1}{2}\sqrt{6}$ and x = 5. 
Let’s put these on our table of signs for f''(x): 

x       |      | 2 - ½√6 |      | 2 + ½√6 |      |   5  |
f''(x)  |      |    0    |      |    0    |      |   0  |
        |      |         |      |         |      |      |

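Now we have to fill in the gaps. It would be nice to know something more 
about $\sqrt{6}$, so let’s try to estimate it without resorting to a calculator! You 
know that $\sqrt{6}$ is between 2 and 3 (since 6 is between 4 and 9), so $\frac{1}{2}\sqrt{6}$ is between 
1 and 3/2. This means that $2 - \frac{1}{2}\sqrt{6}$ is somewhere between 2 - 3/2 = 1/2 and 
2 - 1 = 1, and also that $2 + \frac{1}{2}\sqrt{6}$ is between 2 + 1 = 3 and 2 + 3/2 = 3½. So we 
can pick 0 to the left of $2 - \frac{1}{2}\sqrt{6}$; between $2 - \frac{1}{2}\sqrt{6}$ and $2 + \frac{1}{2}\sqrt{6}$ we’ll pick 
2; between $2 + \frac{1}{2}\sqrt{6}$ and 5, we’ll choose 4; finally, we’ll pick 6 to the right of 
5. Here’s what we get: 

x       |   0  | 2 - ½√6 |   2  | 2 + ½√6 |   4  |   5  |   6
f''(x)  |   -  |    0    |   +  |    0    |   -  |   0  |   +
        |   ∩  |    •    |   ∪  |    •    |   ∩  |   •  |   ∪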
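
The signs in this table are easy to check numerically. The following is only a sketch: it assumes the factored form $f''(x) = 10(x - 5)(2x^2 - 8x + 5)$ that the product-rule computation gives, and the variable names are made up for the illustration.

```python
import math


def fsecond(x):
    """Second derivative, assumed form: f''(x) = 10(x - 5)(2x^2 - 8x + 5)."""
    return 10 * (x - 5) * (2 * x ** 2 - 8 * x + 5)


# Roots of 2x^2 - 8x + 5 = 0 from the quadratic formula.
a, b, c = 2, -8, 5
disc = b * b - 4 * a * c               # 64 - 40 = 24
r1 = (-b - math.sqrt(disc)) / (2 * a)  # 2 - sqrt(6)/2
r2 = (-b + math.sqrt(disc)) / (2 * a)  # 2 + sqrt(6)/2
print("zeroes of f'':", r1, r2, 5)

# Sample points chosen in the text, one in each gap.
for x in (0, 2, 4, 6):
    shape = "concave up" if fsecond(x) > 0 else "concave down"
    print(f"x = {x}: f''(x) = {fsecond(x)}, {shape}")
```

The two computed roots should land inside the intervals estimated by hand, namely between 1/2 and 1 and between 3 and 3½, which is a quick sanity check on both the estimate and the quadratic formula.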





Make sure you agree with all the signs I’ve filled in. For example, when x = 0, 
all three factors of f''(x) are negative, so the product is negative. Also, notice 
how I drew in the little curves in the third row. You can clearly see that f 






of the real line as possible. Remember, you have to avoid anything that would 
lead to 0 in the denominator, or the square root of a negative number, or 
the log of a negative number or 0. If inverse trig functions are involved, 
the situation is more complicated, so I suggest you review the domains 
of all the inverse trig functions. (For example, you can’t take the inverse 
sine of a number outside the interval [-1, 1].) 

5. Vertical asymptotes: these generally occur where the denominator is 
zero (if there is a denominator!). Beware: if the numerator is zero too, 
then you might have a removable discontinuity* instead of a vertical 
asymptote. Also, you may have a vertical asymptote due to a log factor. 
Mark all the vertical asymptotes as dotted vertical lines on your graph. 

6. Sign of the function: at this point, draw up a table of signs for f(x), 
as described in Section 12.1. We already know where f is zero from 
#3 above, and we know where it’s discontinuous from #4 and #5. The 
table tells you exactly where the curve is above or below the x-axis. 

7. Horizontal asymptotes: find the horizontal asymptotes by calculating 

\[ \lim_{x \to \infty} f(x) \quad \text{and} \quad \lim_{x \to -\infty} f(x). \]

Even if the limits are $\pm\infty$, it may be that you can still work out what 
f(x) behaves like for large (or negatively large) x and thereby get a sort 
of “diagonal” asymptote. In any case, draw dashed horizontal lines on 
your graph to remind you about the horizontal asymptotes, if there are 
any. At this point, you can fill in little bits of the function near both 
the horizontal and vertical asymptotes, using the table of signs for f(x) 
to tell which side of each of the asymptotes the function lies on. 

8. Sign of the derivative: now, time for calculus. Find the derivative, 
then find all the critical points (remember, these are points where the 
derivative is 0 or does not exist). Now draw up a table of signs for f'(x), 
as described in Section 12.1.1 above. Use the third row of the table to 
tell where the function is increasing, decreasing, or flat. 

9. Maxima and minima: from the table of signs, you can find all the 
local maxima or minima — remember, these only occur at critical points. 
For each maximum or minimum x, you need to find the value of y by 
substituting the value of x into the equation y = f(x). Make sure you 
label all these points on your graph. 

10. Sign of the second derivative: find the second derivative, then find 
all the points where the second derivative is zero or does not exist. 
Now you should draw up a table of signs for f''(x), as described in 
Section 12.1.2 above. The pictures in the third row of the table indicate 
where the curve is concave up and where it’s concave down. 

11. Points of inflection: use the table of signs for the second derivative 
to identify the inflection points. Remember, the second derivative at 
an inflection point has to be zero, and the sign of the second derivative 
has to be different on either side of the inflection point. For each 
inflection point x, you need to find the y-coordinate by substituting into the 
equation y = f(x). Make sure these points are labeled on your graph. 

Now, using all the information you’ve gathered, complete the sketch of the 
graph. If anything looks inconsistent, then you might have made a mistake! 
All the information you gather should work nicely together. 


*For example, if $f(x) = (x^2 - 3x + 2)/(x - 2)$, then by factoring the numerator as 
$(x - 1)(x - 2)$, you can easily see that f(x) = x - 1 except at x = 2, where f is undefined. 
The graph is on page 42. 

By the way, remember that you can also find the local maxima and 
minima in step 9 above by looking at the sign of the second derivative (see 
Section 11.5.2 in the previous chapter). This method doesn’t always work, 
though; that’s why I recommend using the table of signs for f'(x). 


12.3 Examples 

We’ll start with an example of sketching a curve without using the first or 
second derivatives, then look at four more examples of the complete method. 


12.3.1 An example without using derivatives 

At the beginning of Section 12.1 above, we looked at 

\[ f(x) = \frac{(x - 3)(x - 1)^2}{x^3 (x + 2)}. \]


Let’s sketch y = f(x) using only the first seven steps of our program: 

1. Symmetry: you can plug in -x instead of x, and play around with it, 
but it’s a lost cause: the function is neither odd nor even. 

2. y-intercept: set x = 0; then the denominator vanishes and the 
numerator doesn’t. So the function blows up at x = 0 and there’s no 
y-intercept. 

3. x-intercepts: set y = 0; then we must have x - 3 = 0 or x - 1 = 0, so 
the x-intercepts are at 1 and 3. 

4. Domain: clearly we’re fine for all x except x = 0 and x = -2. 

5. Vertical asymptotes: the denominator vanishes when x = 0 or when 
x = -2; the numerator doesn’t also vanish there, so these are the vertical 
asymptotes. 

6. Sign of the function: we already investigated this thoroughly, and 
found that the function is positive on (-2, 0) and $(3, \infty)$ and negative 
everywhere else (except at the x-intercepts and vertical asymptotes). 
For reference, here’s the table we saw in Section 12.1: 

x     |  -3  |  -2  |  -1  |   0  |  1/2 |   1  |   2  |   3  |   4
f(x)  |   -  |   ★  |   +  |   ★  |   -  |   0  |   -  |   0  |   +
      |      |      |      |      |      |      |      |      |

7. Horizontal asymptotes: we need to look at 

\[ \lim_{x \to \infty} \frac{(x - 3)(x - 1)^2}{x^3 (x + 2)} \quad \text{and} \quad \lim_{x \to -\infty} \frac{(x - 3)(x - 1)^2}{x^3 (x + 2)}. \]

I leave it to you to show that both these limits are 0 (using the methods 
of Section 4.3 in Chapter 4), so there’s a two-sided horizontal asymptote 
at y = 0. 






ising the table of signs once 
same way. Now consider the 
l the curve, since the sign of 
r hand, the function changes 
Lere passes through the axis, 
tiing like this: 

_ Qr-3)Qg-l) 2 
x 3 (x + 2) 



e of the graph. The problem 
nima are except for the local 
maximum at x = 1. Certainly there’s at least one local minimum between 
x = -2 and x = 0, at least one local minimum between x = 1 and x = 3, 
and at least one local maximum greater than x = 3. There could be more, 
though: the graph might have a lot more wobbles than shown. We can’t tell 
without using the derivative. 

So why not use the derivative? For this function, it’s too difficult to deal 
with! If you go to the trouble of calculating it, you will find that 

\[ f'(x) = \frac{-x^4 + 10x^3 - 11x^2 - 16x + 18}{x^4 (x + 2)^2}. \]

Actually, we know x = 1 is a local maximum, so f'(1) should be 0. You can 
check and see that the numerator does indeed vanish at x = 1. This means 
that $(x - 1)$ is a factor of the numerator, and you can do a long division to 
see that the numerator is $(x - 1)(-x^3 + 9x^2 - 2x - 18)$. That still leaves a 
nasty cubic to deal with. At least we do know one thing: the cubic has at 
most three solutions. This means that, in addition to x = 1, there are at 
most three other critical points. In particular, our graph doesn’t have extra 
wobbles, just the four critical points you can see in the picture above. 
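
If you don’t trust the long division, a few lines of Python can confirm it. This is just a sketch with hypothetical helper names (poly_mul, poly_eval); polynomials are stored as coefficient lists with the highest power first. As a bonus, scanning the cubic for sign changes between consecutive integers shows roughly where its roots live.

```python
def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists (highest power first)."""
    result = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            result[i + j] += a * b
    return result


def poly_eval(p, x):
    """Evaluate a polynomial (highest power first) using Horner's rule."""
    value = 0
    for coeff in p:
        value = value * x + coeff
    return value


numerator = [-1, 10, -11, -16, 18]  # -x^4 + 10x^3 - 11x^2 - 16x + 18
linear = [1, -1]                    # x - 1
cubic = [-1, 9, -2, -18]            # -x^3 + 9x^2 - 2x - 18

print(poly_eval(numerator, 1))               # the numerator at x = 1
print(poly_mul(linear, cubic) == numerator)  # does the factoring check out?

# Integer intervals (n, n + 1) on which the cubic changes sign.
brackets = [
    (n, n + 1)
    for n in range(-2, 9)
    if poly_eval(cubic, n) * poly_eval(cubic, n + 1) < 0
]
print(brackets)
```

A sign change on an interval means the cubic has a root somewhere inside it, so three such intervals account for all three of the other critical points, in locations consistent with the two local minima and the local maximum described above.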

As for using the second derivative to find the concavity and points of 
inflection, well, suffice it to say that it’s even worse than the first derivative! 
On the other hand, not every function has such difficult derivatives — let’s look 
at four more examples where we can use the full method. 

12.3.2 The full method: example 1 

At the end of Section 11.5.1 in the previous chapter, we saw that if 

\[ f(x) = x \ln(x), \]

then f has a local minimum at x = 1/e. We even started to sketch its graph. 
Let’s use the full method to complete the graph of y = f(x): 

1. Symmetry: the function isn’t even defined for x < 0, so it certainly 
can’t be odd or even. 

2. y-intercept: set x = 0; then f(x) is undefined, so there can’t be any 
y-intercept. 

3. x-intercepts: set y = 0; then we must have x = 0 or ln(x) = 0. We 
can’t have x = 0, since f isn’t defined there, and ln(x) = 0 only when 
x = 1. So the only x-intercept is at x = 1. 

4. Domain: because of the ln(x) factor, the domain must be $(0, \infty)$. 

5. Vertical asymptotes: the ln(x) factor might actually introduce a 
vertical asymptote at x = 0. Let’s check it out. Since f(x) is only defined 
when x > 0, the best we can do is to consider the right-hand limit 

\[ \lim_{x \to 0^+} x \ln(x). \]

Actually, we know from Section 9.4.6 that this limit is 0, as logs grow 
slowly (to $-\infty$) as $x \to 0^+$. So there are no vertical asymptotes; there’s 
just a (right-hand) removable discontinuity at the origin. 

6. Sign of the function: we know that the function is undefined for 
x ≤ 0, and the only x-intercept is at x = 1. So we need to fill in the 
gaps with something like x = 1/2 and x = 2. When x = 1/2, we have 
ln(1/2) = -ln(2), which is negative, so f has sign (-). When x = 2, 
you can easily see that f has sign (+). So the table of signs looks like 
this: 

x     |  ≤ 0 |  1/2 |   1  |   2
f(x)  |   ★  |   -  |   0  |   +
      |      |      |      |

7. Horizontal asymptotes: we only need to look at 

\[ \lim_{x \to \infty} x \ln(x), \]

since the limit as $x \to -\infty$ doesn’t even make sense. The above limit is 
clearly $\infty$, since both x and ln(x) go to $\infty$ as $x \to \infty$. So there are no 
horizontal asymptotes. 

8. Sign of the derivative: by the product rule, we have f'(x) = ln(x) + 1 
(as we saw in Section 11.5.1 of the previous chapter). So f'(x) = 0 when 
ln(x) = -1, that is, when $x = e^{-1} = 1/e$. We just need to pick a point 
between x = 0 and x = 1/e, and some other point greater than x = 1/e. 
Let’s choose x = 1/10 for the first and x = 1 for the second. Note that 
f'(1/10) = ln(1/10) + 1 = -ln(10) + 1, which is clearly negative; and 
f'(1) = ln(1) + 1 = 1, which is positive. Our table of signs for f'(x) 
looks like this: 

x      |  ≤ 0 | 1/10 |  1/e |   1
f'(x)  |   ★  |   -  |   0  |   +
       |      |   \  |   ─  |   /
9. Maxima and minima: looking at the table of signs, we see that we 
only have a local minimum at x = 1/e. We just need to calculate the 
y-value there: we have $y = e^{-1} \ln(e^{-1}) = -e^{-1} = -1/e$. So there is a 
local minimum at (1/e, -1/e), as we already observed in Section 11.5.1 
of the previous chapter. 

10. Sign of the second derivative: since f'(x) = ln(x) + 1, we have 
f''(x) = 1/x. Since f is only defined when x > 0, we see that f''(x) > 0 
for all relevant x. This means that f is always concave up. 

11. Points of inflection: since f''(x) is never 0, there aren’t any! 

Now, let’s put the information we’ve gathered on a graph. We have a 
removable discontinuity at the origin, a local minimum at (1/e, -1/e), an x-intercept 
at 1, and no horizontal or vertical asymptotes. The graph is below the x-axis 
when x < 1 and above it when x > 1. Also, the function is decreasing for 
0 < x < 1/e and increasing when x > 1/e, and is always concave up. Its 
graph must look something like this: 

It’s not perfect, but it’s a heck of a lot better than our first attempt on 
page 241, since we have a lot more information. 
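
For a numerical sanity check of the main features found above, here is a short Python sketch. It only evaluates f(x) = x ln(x) and f'(x) = ln(x) + 1 at a few points; the sample values of x near 0 are arbitrary choices, and shrinking values of f(x) suggest, but of course don’t prove, that the right-hand limit at 0 is 0.

```python
import math


def f(x):
    """f(x) = x ln(x), defined only for x > 0."""
    return x * math.log(x)


def fprime(x):
    """f'(x) = ln(x) + 1, from the product rule."""
    return math.log(x) + 1


x_min = 1 / math.e
print("f'(1/e) =", fprime(x_min))    # should be (very nearly) 0
print("f(1/e)  =", f(x_min))         # compare with -1/e
print("-1/e    =", -1 / math.e)

# Signs of f' on either side of 1/e, using the sample points from step 8.
print("f'(1/10) < 0:", fprime(0.1) < 0)
print("f'(1) > 0:  ", fprime(1) > 0)

# Behavior near the removable discontinuity at the origin.
for x in (1e-2, 1e-4, 1e-8):
    print(f"f({x}) = {f(x)}")
```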


12.3.3 The full method: example 2 

Let’s look at another function we’ve already investigated somewhat: 

\[ f(x) = x^2 (x - 5)^3. \]

In Section 10.1.4 of Chapter 10, we already made a rough sketch of the graph 
of y = f(x); we’ve also made tables of signs for f(x), f'(x), and f''(x) in 
Section 12.1 above. This means that we can step on the gas and rip right 
through our method: 

1. Symmetry: if you replace x by (-x), you get $(-x)^2 (-x - 5)^3$, which 
simplifies to $-x^2 (x + 5)^3$. This is neither f(x) nor -f(x), so f is neither 
odd nor even. Oh well, you can’t win them all. 

2. y-intercept: when x = 0, we see that y = f(0) = 0. So the y-intercept 
is at y = 0. 

3. x-intercepts: if y = 0, then we must have $x^2 = 0$ or $(x - 5)^3 = 0$. So 
the x-intercepts are at x = 0 and x = 5. 

4. Domain: there are no problems taking f(x) for any x, so the domain 
is the set of all real numbers ℝ. 

5. Vertical asymptotes: since the domain is all of ℝ, there aren’t any 
vertical asymptotes. 

6. Sign of the function: as we saw in Section 12.1, the table of signs 
looks like this: 

x     |  -1  |   0  |   2  |   5  |   6
f(x)  |   -  |   0  |   -  |   0  |   +
      |      |      |      |      |

So the graph is only above the x-axis when x > 5. 

7. Horizontal asymptotes: it’s pretty easy to see that 

\[ \lim_{x \to \infty} x^2 (x - 5)^3 = \infty \quad \text{and} \quad \lim_{x \to -\infty} x^2 (x - 5)^3 = -\infty. \]

After all, when $x \to \infty$, both $x^2$ and $(x - 5)^3$ also go to $\infty$, so their 
product does as well. When $x \to -\infty$, the $x^2$ term goes to $\infty$ and the 
$(x - 5)^3$ term goes to $-\infty$, so the product goes to $-\infty$. We might note 
that when x is large (positive or negative), the quantity $(x - 5)$ behaves 
like its highest-degree term x, so $x^2 (x - 5)^3$ behaves like $x^5$ near the 
edges of the graph, but not near the origin! 

8. Sign of the derivative: as we saw in Section 12.1.1, the table of signs 
for f'(x) is as follows: 

x      |  -1  |   0  |   1  |   2  |   3  |   5  |   6
f'(x)  |   +  |   0  |   -  |   0  |   +  |   0  |   +
       |   /  |   ─  |   \  |   ─  |   /  |   ─  |   /

This tells us where the function is increasing, decreasing or flat. 

9. Maxima and minima: we see from the above table that x = 0 is a local 
maximum, x = 2 is a local minimum, and x = 5 is a horizontal point of 
inflection. Now we need to calculate the corresponding y-coordinates by 
using the formula $y = f(x) = x^2 (x - 5)^3$. This isn’t too bad: f(0) = 0, 
$f(2) = (2)^2 (-3)^3 = -108$, and f(5) = 0. So there’s a local maximum 
at the origin, a local minimum at (2, -108) and a horizontal point of 
inflection at (5, 0). 

10. Sign of the second derivative: we already found this in Section 12.1.2: 

x       |   0  | 2 - ½√6 |   2  | 2 + ½√6 |   4  |   5  |   6
f''(x)  |   -  |    0    |   +  |    0    |   -  |   0  |   +
        |   ∩  |    •    |   ∪  |    •    |   ∩  |   •  |   ∪

We can use this to see where the function is concave up and where it’s 
concave down. Notice that f''(0) < 0, which confirms that the critical 
point x = 0 is a local maximum; and also that f''(2) > 0, confirming 
that the critical point x = 2 is a local minimum. 

11. Points of inflection: from the above table, we have points of inflection 
at $x = 2 - \frac{1}{2}\sqrt{6}$, $x = 2 + \frac{1}{2}\sqrt{6}$, and x = 5. Actually, we already knew about 
this last one, since we saw in step 9 above that (5, 0) is a horizontal point 
of inflection. The other two are a lot messier. We need to substitute 
$x = 2 - \frac{1}{2}\sqrt{6}$ and $x = 2 + \frac{1}{2}\sqrt{6}$, one at a time, into the original equation 
$y = x^2 (x - 5)^3$. Unfortunately, you get a bit of a mess. Let’s cheat a 
little and define $\alpha = f(2 - \frac{1}{2}\sqrt{6})$ and $\beta = f(2 + \frac{1}{2}\sqrt{6})$. This means that 

\[ \alpha = \left(2 - \tfrac{1}{2}\sqrt{6}\right)^2 \left(-3 - \tfrac{1}{2}\sqrt{6}\right)^3 \quad \text{and} \quad \beta = \left(2 + \tfrac{1}{2}\sqrt{6}\right)^2 \left(-3 + \tfrac{1}{2}\sqrt{6}\right)^3. \]

Actually, if you go to the trouble of multiplying everything out, you can 
simplify these expressions, but it’s no fun at all. We might also make a 
rare use of a calculator to see that α is approximately -45.3 and β is 
approximately -58.2. These are approximations only! The calculator 
can never give you the true value of an irrational number such as α 
or β. Anyway, we have found points of inflection at $(2 - \frac{1}{2}\sqrt{6}, \alpha)$ and 
$(2 + \frac{1}{2}\sqrt{6}, \beta)$ as well as (5, 0). 
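
The calculator step for α and β can be reproduced with a few lines of Python. This is only a sketch using floating-point arithmetic, so, just as the text warns, the printed values are approximations and not the true irrational numbers.

```python
import math


def f(x):
    """f(x) = x^2 (x - 5)^3."""
    return x ** 2 * (x - 5) ** 3


half_root6 = math.sqrt(6) / 2

# y-coordinates of the two nonhorizontal inflection points.
alpha = f(2 - half_root6)
beta = f(2 + half_root6)

print("alpha is about", round(alpha, 1))
print("beta is about ", round(beta, 1))
```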
Now let’s put everything together. Starting with a set of axes, mark in the 
y-intercept at the origin, the x-intercepts at 0 and 5, the local maximum at 
the origin, the local minimum at (2, -108), the horizontal inflection point at 
(5, 0), and the nonhorizontal inflection points at $(2 - \frac{1}{2}\sqrt{6}, \alpha)$ and $(2 + \frac{1}{2}\sqrt{6}, \beta)$. 

We also know that $y \to \infty$ as $x \to \infty$, and $y \to -\infty$ as $x \to -\infty$, so we can 
put a small piece of curve to indicate this. Altogether, here’s what we get: 

[Figure: partial sketch with $2 - \frac{1}{2}\sqrt{6}$, 2, $2 + \frac{1}{2}\sqrt{6}$ and 5 marked on the x-axis and α, β and -108 marked on the y-axis] 

Note that we know from the table of signs for f'(x) that the slope at the 
inflection point $(2 - \frac{1}{2}\sqrt{6}, \alpha)$ is negative and that the slope at $(2 + \frac{1}{2}\sqrt{6}, \beta)$ is 
positive. Now all we have to do is join the pieces: 

Again, this is better than our previous attempt at sketching this graph on 
page 207, because it shows the inflection points as well. 

12.3.4 The full method: example 3 



Let’s sketch the graph of y = f(x), where 

\[ f(x) = x e^{-3x^2/2}. \]

1. Symmetry: replace x by (-x) and we get $-x e^{-3(-x)^2/2} = -x e^{-3x^2/2}$, 
which is just -f(x). This means that the function is odd, which is a 
major bonus: we only have to graph it for x > 0, then it’ll be easy to 
get the other half. 

2. y-intercept: if x = 0, then $y = 0 e^{-3(0)^2/2} = 0$. So the y-intercept is at 
y = 0. 

3. x-intercepts: if y = 0, then we have $0 = x e^{-3x^2/2}$. So either x = 0 or 
$e^{-3x^2/2} = 0$. The latter equation has no solution, since exponentials are 
always positive! So the only x-intercept is at x = 0. So far, all we know 
is that the function is odd and the only place it crosses the axes is at 
the origin. 


4. Domain: clearly you can make x equal to anything and never have a 
problem: there are no square roots or logs, and even if you write the 
function as 

\[ f(x) = \frac{x}{e^{3x^2/2}}, \]

the denominator can’t be zero since exponentials are always positive. So 
the domain is the real line ℝ. 

5. Vertical asymptotes: there aren’t any, since the domain is ℝ. 

6. Sign of the function: we know that the only place f(x) = 0 is when 
x = 0, so the table of signs is ridiculously simple: 

x     |  -1  |   0  |   1
f(x)  |   -  |   0  |   +
      |      |      |

The function is positive when x > 0 and negative when x < 0. 

7. Horizontal asymptotes: we need to find 

\[ \lim_{x \to \infty} \frac{x}{e^{3x^2/2}} \quad \text{and} \quad \lim_{x \to -\infty} \frac{x}{e^{3x^2/2}}. \]

Note that $3x^2/2$ is a large positive number in either case, so the 
denominator is a large exponential. Since exponentials grow quickly (see 
Section 9.4.4 in Chapter 9), both the above limits are 0. So there is a 
two-sided horizontal asymptote at y = 0. 

8. Sign of the derivative: now we have to differentiate. By the product 
rule and the chain rule, you can check that 

\[ f'(x) = x(-3x) e^{-3x^2/2} + e^{-3x^2/2} = (1 - 3x^2) e^{-3x^2/2}. \]

This is defined everywhere, but where is it 0? Since exponentials are 
positive, it is only 0 when $1 - 3x^2 = 0$, that is, when $x = 1/\sqrt{3}$ or 
$x = -1/\sqrt{3}$. 

using the product rule and chain rule once more. We find that 

\[ f''(x) = (1 - 3x^2)(-3x) e^{-3x^2/2} + (-6x) e^{-3x^2/2} = 9x(x^2 - 1) e^{-3x^2/2}. \]

Once again, since exponentials are positive, the only way that f''(x) can 
equal 0 is if x = 0 or $x^2 - 1 = 0$, that is, if x = 0, x = 1 or x = -1. The 
table of signs looks like this: 


x       |  -2  |  -1  | -1/2 |   0  |  1/2 |   1  |   2
f''(x)  |   -  |   0  |   +  |   0  |   -  |   0  |   +
        |   ∩  |   •  |   ∪  |   •  |   ∩  |   •  |   ∪

For x = 1/2, the factor 9x is positive whereas $(x^2 - 1)$ is negative, 
and the exponential is positive, so the whole thing is negative. When 
x = 2, it’s just as easy to see that the second derivative is positive. The 
situation for x = -1/2 and x = -2 is just as easy and in fact follows by 
symmetry. (Since the original function is odd, its derivative is even and 
its second derivative is odd. You may have to think about this point a 
little!) The third row indicates that the graph is concave down when 
x < -1 or 0 < x < 1, and concave up when x > 1 or -1 < x < 0. By the 
way, notice that at the critical point $x = 1/\sqrt{3}$, the second derivative is 
negative; this confirms that we have a local maximum there. Similarly, 
when $x = -1/\sqrt{3}$, the second derivative is positive, so we do indeed 
have a local minimum there. 

11. Points of inflection: from the above table, we can see that the 
concavity clearly changes at x = 1, x = -1, and x = 0; so these are all points 
of inflection and we just need to find the y-coordinates. By substituting 
in the equation $y = x e^{-3x^2/2}$, it’s easy to see that the points of inflection 
should be displayed on the graph as $(1, e^{-3/2})$, $(-1, -e^{-3/2})$ and (0, 0). 

If you’ve been really good, you would have been plotting what we already 
know on a set of axes, and you should have something like this: 

[Figure: partial sketch with -1, $-1/\sqrt{3}$, $1/\sqrt{3}$ and 1 marked on the x-axis and $\pm e^{-1/2}/\sqrt{3}$ and $\pm e^{-3/2}$ marked on the y-axis] 

On this graph, you can see the x- and y-intercepts (at the origin), the 
horizontal asymptote (the x-axis), the maximum at $(1/\sqrt{3}, e^{-1/2}/\sqrt{3})$, the minimum 
at $(-1/\sqrt{3}, -e^{-1/2}/\sqrt{3})$, and the inflection points at $(1, e^{-3/2})$, $(-1, -e^{-3/2})$, 
and (0, 0) (shown as dotted lines for now). Because we know the sign of f(x) 
from step 6, we’ve even diagnosed the behavior near the horizontal 
asymptotes and displayed this information on the graph. Anyway, all that’s left is 
to connect the dots: 

This sketch really illustrates all the important features of the graph. 
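
Since this example leans on two rounds of the product and chain rules, it’s worth having a way to check the derivative formulas. The Python sketch below compares them against a central-difference approximation; the helper central_difference, its step size h, and the test point 0.7 are arbitrary choices for illustration, and agreement at one point is evidence, not proof.

```python
import math


def f(x):
    """f(x) = x e^(-3x^2/2)."""
    return x * math.exp(-3 * x ** 2 / 2)


def fprime(x):
    """Claimed derivative: (1 - 3x^2) e^(-3x^2/2)."""
    return (1 - 3 * x ** 2) * math.exp(-3 * x ** 2 / 2)


def fsecond(x):
    """Claimed second derivative: 9x(x^2 - 1) e^(-3x^2/2)."""
    return 9 * x * (x ** 2 - 1) * math.exp(-3 * x ** 2 / 2)


def central_difference(g, x, h=1e-5):
    """Approximate g'(x) by the symmetric difference quotient."""
    return (g(x + h) - g(x - h)) / (2 * h)


x0 = 0.7  # arbitrary test point
print(central_difference(f, x0), fprime(x0))
print(central_difference(fprime, x0), fsecond(x0))

# The local maximum found from the table of signs.
x_max = 1 / math.sqrt(3)
print("f'(1/sqrt(3)) =", fprime(x_max))
print("f(1/sqrt(3))  =", f(x_max), "vs", math.exp(-0.5) / math.sqrt(3))
```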
12.3.5 The full method: example 4 

Now let’s do it all over again: we’ll sketch the graph of y = f(x), where f is 
the fearsome-looking function defined by 

\[ f(x) = \frac{x^3 - 6x^2 + 13x - 8}{x}. \]

1. Symmetry: replacing x by -x, we get $(-x^3 - 6x^2 - 13x - 8)/(-x)$, 
which is neither f(x) nor -f(x), so there’s no symmetry. Bummer. 

2. y-intercept: put x = 0, and you get -8/0, which is undefined. So 
there’s no y-intercept. 

3. x-intercepts: now things get nasty. We need to set y = 0, which means 
that $x^3 - 6x^2 + 13x - 8 = 0$. This is a cubic equation, so factoring might 
be a pain in the butt. The best bet is to try to guess a solution. Try 
x = 1. Well, you get 1 - 6 + 13 - 8 = 0, and it works! (Basically, the only 
nice solutions would be factors of the constant term -8, so if ±1, ±2, 
±4 and ±8 don’t work, you’re screwed.) Luckily our first try worked 
and we know that $(x - 1)$ is a factor. Now we have to divide: 

\[ x - 1 \;\big)\!\overline{\; x^3 - 6x^2 + 13x - 8 \;} \]

I leave it to you to do this division and show that the other factor 
is $x^2 - 5x + 8$. Can you factor this quadratic? The discriminant is 
$(-5)^2 - 4(8) = -7$, which is negative, so you can’t factor the quadratic. 
That is, we have $x^3 - 6x^2 + 13x - 8 = (x - 1)(x^2 - 5x + 8)$, and the 
second factor is always positive, so the only x-intercept is x = 1. 

4. Domain: the only problem is at x = 0, so the domain is ℝ\{0}. 

5. Vertical asymptotes: there’s one at x = 0, since the denominator 
vanishes there but the numerator doesn’t. There can’t be any other 
vertical asymptotes because the function is defined everywhere else. 

6. Sign of the function: write f(x) as 

\[ f(x) = \frac{(x - 1)(x^2 - 5x + 8)}{x}. \]

The only x-intercept is at x = 1, and the only discontinuity is at x = 0, 
so our table of signs looks like this: 

x     |  -1  |   0  |  1/2 |   1  |   2
f(x)  |   +  |   ★  |   -  |   0  |   +
      |      |      |      |      |

(Make sure you believe the signs at x = -1, x = 1/2, and x = 2.) 
7. Horizontal asymptotes: consider 

\[ \lim_{x \to \infty} \frac{x^3 - 6x^2 + 13x - 8}{x} \quad \text{and} \quad \lim_{x \to -\infty} \frac{x^3 - 6x^2 + 13x - 8}{x}. \]

So where is the derivative equal to 0, and where does it not exist? It’s 
pretty obvious that the only place it doesn’t exist is when x = 0. On 
the other hand, if f'(x) = 0, then we must have $2x^3 - 6x^2 + 8 = 0$. Once 
again we need a solution to a cubic equation; this time, x = 1 doesn’t 
work, so try x = -1. Hey, it does work! After you do the long division, 
you find that you can factor the cubic as $2(x + 1)(x - 2)^2$. That is, 

\[ f'(x) = \frac{2(x + 1)(x - 2)^2}{x^2}. \]

So the derivative is undefined at x = 0 and it equals zero when x = -1 
or x = 2. Now we can draw up a table of signs for f'(x): 

x      |  -2  |  -1  | -1/2 |   0  |   1  |   2  |   3
f'(x)  |   -  |   0  |   +  |   ★  |   +  |   0  |   +
       |   \  |   ─  |   /  |      |   /  |   ─  |   /

Make sure you check the details of this table! In any case, we can see 
that the function is increasing when x > -1 (except at the critical points 
x = 0 and x = 2) and the function is decreasing when x < -1. 

9. Maxima and minima: looking at the table of signs, we see that x = -1 
is a local minimum and x = 2 is a horizontal point of inflection. We 
need the y-coordinates; it’s not too hard to see that f(-1) = 28 and 
f(2) = 1. So (-1, 28) is a local minimum and (2, 1) is a horizontal point 
of inflection. 

10. Sign of the second derivative: we know that x = 2 is a point of 
inflection, but are there any others? Let’s find out. Use the form 

\[ f'(x) = 2x - 6 + \frac{8}{x^2} \]

to find that 

\[ f''(x) = 2 - \frac{16}{x^3} = \frac{2(x^3 - 8)}{x^3}. \]

So the second derivative is undefined at x = 0 and it’s zero only when 
$x^3 - 8 = 0$, so x = 2. There aren’t any other points of inflection! Let’s 
draw up the table of signs: 

x       |  -1  |   0  |   1  |   2  |   3
f''(x)  |   +  |   ★  |   -  |   0  |   +
        |   ∪  |      |   ∩  |   •  |   ∪

You can see that the graph is concave up when x < 0 and x > 2, and 
concave down when 0 < x < 2. By the way, at the critical point x = -1, 
we have f''(x) > 0, so we indeed have a local minimum there; on the 
other hand, at the critical point x = 2, we see that f''(2) = 0, which by 
itself wouldn’t have been enough information to confirm the inflection 


















CHAPTER 13 


Optimization and Linearization 


We’re now going to look at two practical applications of calculus: 
optimization and linearization. Believe it or not, these techniques are used every day 
by engineers, economists, and doctors, for example. Basically, optimization 
involves finding the best situation possible, whether that be the cheapest way 
to build a bridge without it falling down or something as mundane as 
finding the fastest driving route to a specific destination. On the other hand, 
linearization is a useful technique for finding approximate values of 
hard-to-calculate quantities. It can also be used to find approximate values of zeroes 
of functions; this is called Newton’s method. In summary, we’ll look at 


• how to solve optimization problems, and three examples of such problems;

• using linearization and the differential to estimate certain quantities; 

• how good our estimates are; and 

• Newton’s method for estimating zeroes of functions. 

13.1 Optimization 

To “optimize” something means to make it as good as possible. This being 
math, we’re going for quantity over quality here. Suppose there is a certain 
quantity we care about. It could be a number, a length, an angle, an area, 
a cost, an amount of money earned, or one of oodles of other possibilities. If 
it’s a good thing, like amount of money earned, then we’d like to make the 
quantity as large as possible; if it’s a bad thing, like cost, then we’d like to 
make it as small as possible. In a nutshell, we want to maximize or minimize 
the quantity. So in our context, the term “optimize” just means “maximize 
or minimize, as appropriate.” 

13.1.1 An easy optimization example

In the last few chapters, we’ve spent quite a lot of time learning how to 
find maxima and minima of functions. So far as optimization is concerned, 
normally we would be interested in finding global maxima and minima. In 
Section 11.1.3 of Chapter 11, we looked at a nice method for doing this. I
urge you to go back and read this section now to refresh your memory. 

In any case, to use our method, we need to express the quantity as a 
function of one other quantity that we can control. For example, suppose 
that two real numbers add up to 10, but neither number is greater than 8.
How large could the product of the two numbers possibly be, and how small 
could it be? 


Before we bust out our method, let’s just explore the situation first. If 
one of the numbers is 8, which is as large as either number can be, then the 
other number is 2 and the product is 16. At the other extreme, the numbers 
are both equal to 5 and the product is 25, which is certainly larger than 16. 
Can we make the product larger than 25 or smaller than 16? How about if 
the numbers are 4½ and 5½? Try it and see.

Now let’s get serious and choose some variables. Suppose that the numbers 
are x and y, and that their product is P. Well, we know that P = xy. The 
quantity we want to optimize is P, but it’s a function of two variables: x 
and y. This doesn’t suit us at all. We really need P to be a function of one 
variable — it doesn’t matter which one. Luckily we have one other piece of 
information: we know that x + y = 10. This means that we can eliminate y
by writing y = 10 − x. If we do that, then P = x(10 − x). This expresses P
as a function of x alone. 

One important point, though: what is the domain of P? Sure, you could 
plug any x into the formula x(10 — x) and get a meaningful answer, but we 
know something about x that we haven’t expressed in math terms yet: it 
can’t be more than 8. Actually, it can’t be less than 2 either, or else y would 
be bigger than 8. So x must lie in the interval [2,8]. We should consider this 
to be the domain of P. 

So we have rewritten our word problem as follows: find the maximum and
minimum of P = x(10 − x) on the domain [2, 8]. Not so bad! We just write
P = 10x − x², so we have dP/dx = 10 − 2x. This is 0 when x = 5, so that’s
the only critical point. We also could have a maximum or minimum at the
endpoints x = 2 and x = 8. Our list of potential maxima and minima is
therefore 2, 5, and 8. When x = 2 or x = 8, we see that P = 16, and when
x = 5, we have P = 25. The conclusion is that the maximum value of the
product is indeed 25, and this occurs when both numbers are 5. The minimum
value is 16, which occurs when one number is 8 and the other is 2. Notice
that when I stated this conclusion, I didn’t mention P, x, or y, since those
were variables that I introduced. If the variables aren’t actually given in
the problem, then you not only have to identify them and pick names for
them; you also have to write your final conclusion without mentioning them!

It doesn’t hurt to verify that x = 5 is indeed a maximum by looking at a
table of signs* for P'(x), using the formula P'(x) = 10 − 2x:

    x      | 4 | 5 | 6
    P'(x)  | + | 0 | −
           | / | _ | \

*See Section 12.1.1 in the previous chapter.
Yup, it’s a maximum. We could also verify that x = 5 is a maximum by
looking at the sign of the second derivative, as described in Section 11.5.2
of Chapter 11. Indeed, P''(x) = −2, so P''(5) = −2 as well. Since that’s
negative, we again see that x = 5 is a local maximum (which is also a global
maximum). Neither of these methods works on the endpoints, though; they
only work for critical points.
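
If you’d like to double-check the arithmetic by machine, here is a small Python sketch of the same calculation. It is only an illustration: the function and variable names are invented for this example. It evaluates the product at the one critical point and at the two endpoints, then picks out the largest and smallest values.

```python
def product(x):
    """P = x(10 - x): the product of two numbers x and 10 - x that add up to 10."""
    return x * (10 - x)


def product_derivative(x):
    """dP/dx = 10 - 2x."""
    return 10 - 2 * x


# The only critical point solves 10 - 2x = 0; the domain is [2, 8].
critical_point = 5
endpoints = [2, 8]
candidates = endpoints + [critical_point]

# Value of P at every candidate for a maximum or minimum.
values = {x: product(x) for x in candidates}
max_x = max(values, key=values.get)
min_value = min(values.values())

print(values)
print("largest product:", values[max_x], "at x =", max_x)
print("smallest product:", min_value)
```

The list of candidates is exactly the list 2, 5, and 8 from the discussion above; nothing else needs to be checked.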


13.1.2 Optimization problems: the general method



Here’s a way to tackle optimization problems in general: 

1. Identify all the variables you might possibly need. One of them should 
be the quantity you want to maximize or minimize — make sure you know 
which one! Let’s call it Q for now, although of course it might be another 
letter like P, m, or a. 

2. Get a feel for the extremes of the situation, seeing how far you can push 
your variables. (For example, in the problem from the previous section, 
we saw that x had to be between 2 and 8.) 

3. Write down equations relating the variables. One of them should be an 
equation for Q. 

4. Try to make Q a function of only one variable, using all your equations 
to eliminate the other variables. 


5. Differentiate Q with respect to that variable, then find the critical points; 
remember, these occur where the derivative is 0 or the derivative doesn’t 
exist. 


6. Find the values of Q at all the critical points and at the endpoints. Pick 
out the maximum and minimum values. As a verification, use a table of 
signs or the sign of the second derivative to classify the critical points. 

7. Write out a summary of what you’ve found, identifying the variables in 
words rather than symbols (wherever possible). 


Actually, sometimes step 4 can be quite difficult, but you might be able to 
avoid it altogether by using implicit differentiation. We’ll see how to do this 
in Section 13.1.5 below. 


13.1.3 An optimization example 



Let’s see how to apply the method. Suppose that the border of a farm is a 
long, straight fence, and that the farmer wants to fence off a little enclosure 
for some horses to graze in. The farmer is a little eccentric and would like to 
make the enclosure in the shape of a right-angled triangle with the existing 
fence as one of the sides which is not the hypotenuse, like this: 
Assuming that only 300 feet of fencing are available, and that the farmer wants
the enclosure to have the largest possible area, what are the dimensions and 
area of the enclosure? 

Let’s pick some variables. We’ll let the base of the triangle be b, the height
be h, the hypotenuse be H (all in feet), and the area be A (in square feet),
like this:



Note that the fence is of length h + H, and we want to maximize A. That
completes step 1. Moving on to the second step, consider some extreme shapes 
that you can make out of 300 feet of fencing: 



In the first case, h is nearly 0, while b and H are both almost 300, but the area 
is tiny! In the second case, b is nearly 0, while h and H are both almost 150. 
The area is still very small. So we can do better by some middle-of-the-road 
solution. We have at least determined that b and H are between 0 and 300, 
and that h is between 0 and 150. 

Moving on to step 3, we see that A = ½bh and also that h + H = 300. We
still need one more equation, since we have to condense the three variables b,
h, and H down to one. In fact, we can use Pythagoras’ Theorem to say that
b² + h² = H².

Now we should try to eliminate some variables. We can take square roots
and write H = √(b² + h²), since we know H > 0; substituting into h + H = 300,
we get the equation h + √(b² + h²) = 300. Let’s try to eliminate b from this.
Subtract h from both sides and square to get

$$b^2 + h^2 = (300 - h)^2 = 90000 - 600h + h^2.$$

This means that b = √(90000 − 600h) = 10√(900 − 6h), again since b is positive
(that is, it can’t be the negative square root!). Finally, the equation A = ½bh
can be rewritten as

$$A = \frac{1}{2} \times 10\sqrt{900 - 6h} \times h = 5h\sqrt{900 - 6h},$$

where h lies in the interval [0, 150]. That’s step 4. As for step 5, you can use
the product rule and the chain rule to see that

$$\frac{dA}{dh} = 5\sqrt{900 - 6h} - \frac{15h}{\sqrt{900 - 6h}} = \frac{45(100 - h)}{\sqrt{900 - 6h}}.$$

This equals 0 when 100 − h = 0, that is, when h = 100. Moving on to step 6,
we substitute h = 100 into the equation for A above, and we get

$$A = 5(100)\sqrt{900 - 6(100)} = 500\sqrt{300} = 5000\sqrt{3}.$$

On the other hand, at the endpoint h = 0, we see that A = 0; similarly, when
h = 150, the quantity 900 − 6h vanishes, so A = 0 once again. The conclusion
is that A is maximized when h = 100. We can check this with a table of signs.
This isn’t so bad, since the numerator of dA/dh is just 45(100 − h), while the
denominator is always positive. The table of signs for dA/dh looks like this:

    h      | 99 | 100 | 101
    dA/dh  | +  | 0   | −
           | /  | _   | \


So h = 100 is indeed a local maximum, as we suspected. 

Now we just have to finish it off. The question asks for the dimensions,
and we only have one: h = 100. We’d better find b and H. Just look back at
the equations: we know that h + H = 300, so we immediately get H = 200.
Also, we know that b² + h² = H²; plugging in h = 100 and H = 200,
we can see that b = 100√3. Finally, we already found that the maximum
value of A is 5000√3. So our concluding sentence could go something like
this: the enclosure of maximal area has base 100√3 feet, height 100 feet, and
hypotenuse 200 feet, and the area is then 5000√3 square feet.
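
Here is a short Python sketch that checks these numbers. Treat it as a rough sanity check rather than part of the method: the names are invented for this illustration, and the scan over whole-number heights is just a crude way of confirming that nothing beats h = 100.

```python
import math


def area(h):
    """A = 5h*sqrt(900 - 6h): area of the enclosure for a height h in [0, 150]."""
    return 5 * h * math.sqrt(900 - 6 * h)


def area_derivative(h):
    """dA/dh = 45(100 - h)/sqrt(900 - 6h), valid for 0 <= h < 150."""
    return 45 * (100 - h) / math.sqrt(900 - 6 * h)


h_best = 100
H_best = 300 - h_best                        # from h + H = 300
b_best = math.sqrt(H_best**2 - h_best**2)    # from b^2 + h^2 = H^2

# Crude independent check: try every whole-number height from 0 to 150.
h_scan = max(range(0, 151), key=area)

print("base:", b_best, "height:", h_best, "hypotenuse:", H_best)
print("area:", area(h_best), "best whole-number height:", h_scan)
```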


13.1.4 Another optimization example 



Here’s a nice problem. Suppose that you are manufacturing closed, hollow
cylindrical metal cans. You can choose their dimensions, but the volume of
a can must be 16π cubic inches. You’d like to use as little metal as possible,
since the metal costs 2 cents per square inch. What dimensions should the
cans be to make your costs as low as possible, and how much does each can
cost in that case?


As a follow-up problem, how does the situation change if we now take 
into account that the top and bottom of each can have to be welded onto the 
curved bit, and it costs 14 cents an inch to weld? 

Let’s start with the first part. Here’s a diagram of the situation:

[Figure: a closed cylindrical can with its height labeled h.]

To describe the cylinder, we only need to say what its radius and height are,
so let’s call them r and h (in inches). We’ll also need the volume V (in cubic
inches), since the question mentions it. Also, the cost depends on how much
metal we use, which is basically the surface area of the cylinder. Let’s call a
can’s surface area A (in square inches) and its cost C (in cents). The quantity
C is the one we want to minimize, although it’s pretty obvious that it will be
minimized if we can also minimize A. (This won’t be true for the follow-up
question!)

Now, moving on to step 2 of our method, what happens when the radius 
r is really really small? The height h then has to be large so we can have our 
volume of 16π cubic inches. We’d get a really tall, skinny cylinder like the
first picture below. On the other hand, if r is really large, then h has to be 
small, and you get a wide, squat cylinder like the second picture: 



Even though they look pretty extreme, actually they can get weirder. In 
fact, r can be any positive number at all! So there aren’t really endpoints; 
both r and h have to lie in the open interval (0, ∞) and we’ll have to be
careful. In either of the above pictures, it looks like there’s a whole lot of 
metal involved, so the low-cost solution probably looks more like the nicely 
proportioned cylinder above than either of the two extreme ones. 

Now it’s time for step 3: we have to find some equations. We know
V = 16π; also, since V = πr²h for a cylinder, we have our first useful equation:

$$16\pi = \pi r^2 h.$$

We can rewrite this as 16 = r²h or

$$h = \frac{16}{r^2}.$$

On the other hand, the surface area of a closed cylinder is

$$A = 2\pi rh + 2\pi r^2,$$

where the first term in the sum comes from the curved part and the second
term is from the top and bottom. (If there were no top, the second term
would just be πr² without the factor 2.) Finally, the cost is 2 cents times the
total area, so we have

$$C = 2A = 4\pi rh + 4\pi r^2.$$


For step 4, notice that both terms on the right-hand side above involve r, so
it’s easier to get rid of h. Since we saw that h = 16/r², we can just substitute
and get

$$C = 4\pi r\left(\frac{16}{r^2}\right) + 4\pi r^2 = 4\pi\left(\frac{16}{r} + r^2\right).$$

Great! We’ve expressed C in terms of r, and now the question is to minimize
C when r lies in the interval (0, ∞). We have

$$\frac{dC}{dr} = 4\pi\left(-\frac{16}{r^2} + 2r\right),$$

which exists for all r in (0, ∞) and is zero precisely when

$$-\frac{16}{r^2} + 2r = 0,$$

or 2r³ = 16. This means that r³ = 8, so r = 2 is the only critical point. How
about the endpoints? We can’t substitute r = 0 into the formula for C, but
we can take a limit:

$$\lim_{r \to 0^+} C = \lim_{r \to 0^+} 4\pi\left(\frac{16}{r} + r^2\right) = \infty.$$

The limit is infinite because the 16/r term blows up as r → 0⁺. This means
that as the radius goes down to 0, our costs get larger and larger. This isn’t
what we want at all! So we’ll stay away from that endpoint. How about the
other endpoint of our interval (0, ∞)? Once again, we can’t just set r = ∞,
so we’ll take a limit:

$$\lim_{r \to \infty} C = \lim_{r \to \infty} 4\pi\left(\frac{16}{r} + r^2\right) = \infty.$$

This time it’s the r² term that blows up. No matter, we still need to avoid this
endpoint. So our conclusion is that r = 2 gives a local and global minimum.
We can check this by using a table of signs for dC/dr or by looking at the
sign of the second derivative. Let’s use the second derivative:

$$\frac{d^2C}{dr^2} = 4\pi\left(\frac{32}{r^3} + 2\right).$$

This is always positive when r is in the domain (0, ∞); in particular, when
r = 2, it’s positive, so we must have a local minimum there.

All that’s left is to find the other variables when r = 2 and write up
our conclusion. Indeed, when r = 2, we can see that h = 16/r² = 4, and
C = 4πrh + 4πr² = 48π. This means that the cheapest shape occurs when
the radius is 2 inches and the height is 4 inches; each can costs 48π cents,
which is about $1.50 (pretty expensive for a lousy can!). Notice that the
diameter and the height of the can are the same in this case.
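
The same check can be done numerically. The following Python sketch is illustrative only (the names and the grid of trial radii are choices made for this example): it evaluates the cost formula C = 4π(16/r + r²), confirms that the derivative vanishes at r = 2, and scans a range of radii to see that none of them is cheaper.

```python
import math


def cost(r):
    """C = 4*pi*(16/r + r^2): cost in cents of a can of radius r > 0."""
    return 4 * math.pi * (16 / r + r**2)


def cost_derivative(r):
    """dC/dr = 4*pi*(-16/r^2 + 2r)."""
    return 4 * math.pi * (-16 / r**2 + 2 * r)


r_best = 2.0
h_best = 16 / r_best**2          # from h = 16/r^2

# Rough independent check: try the radii 0.01, 0.02, ..., 10.00.
radii = [k / 100 for k in range(1, 1001)]
r_scan = min(radii, key=cost)

print("radius:", r_best, "height:", h_best)
print("cost in cents:", cost(r_best), "cheapest radius on the grid:", r_scan)
```

Since the cost blows up at both ends of (0, ∞), a finite grid like this one can only suggest the answer; the limits above are what actually rule out the endpoints.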

Now let’s do the follow-up problem. Everything is the same as it was in
the original problem, except that we now have to add on the welding cost of
14 cents per inch, so our formula for C will change. How much welding is
there per can? Well, we need to weld on the top and the bottom, so we’re
dealing with twice the circumference of one of these circles. That means we
need to weld twice 2πr, or 4πr, inches per can. This adds a cost of 14 × 4πr
cents per can, so our new formula for C is

$$C = 4\pi\left(\frac{16}{r} + r^2\right) + 14 \times 4\pi r = 4\pi\left(\frac{16}{r} + r^2 + 14r\right).$$

Differentiating, we get

$$\frac{dC}{dr} = 4\pi\left(-\frac{16}{r^2} + 2r + 14\right),$$

which equals 0 when

$$-\frac{16}{r^2} + 2r + 14 = 0.$$

To solve this equation, multiply through by r², divide by 2, and rearrange
the terms to get

$$r^3 + 7r^2 - 8 = 0.$$

(Make sure you check that this is right!) Great. Now we have to solve a cubic
equation. Luckily, something simple works: r = 1. So you can do a long
division and see that the other factor is (r² + 8r + 8) (check this!). So we have

$$(r - 1)(r^2 + 8r + 8) = 0,$$


and either r = 1 or r² + 8r + 8 = 0. The solutions of the quadratic equation
are

$$r = \frac{-8 \pm \sqrt{32}}{2},$$

both of which are negative since √32 is only about 6. So the only critical
point when r is positive is r = 1. Once again, this is a minimum because the
costs are infinite at the endpoints (for the same reason as before; the welding
certainly doesn’t make it cheaper). Alternatively, we have

$$\frac{d^2C}{dr^2} = 4\pi\left(\frac{32}{r^3} + 2\right),$$

which is actually the same as it was before. So it’s positive, the curve is
concave up and we do have a minimum when r = 1.

Now we just need to substitute. We find that h = 16/r² = 16, and
C = 4π(16/1 + 1² + 14 × 1) = 124π cents, which is nearly $4! Looks like we
have to cut costs somehow. In any case, the ideal can now has radius 1 inch
and height 16 inches, and it costs 124π cents to make. Notice that the optimal
radius is now less than it was in the first part of the question, which makes
sense since a smaller radius really cuts down on those expensive welding costs.
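
If you don’t spot the root r = 1 by inspection, you can always hunt for it numerically. The Python sketch below is one way to do that, under a few assumptions of its own: the bisection helper and the bracketing interval [0.1, 5] are choices made for this illustration, and the names are invented. It also confirms that the two roots of the quadratic factor are negative.

```python
import math


def welded_cost(r):
    """C = 4*pi*(16/r + r^2 + 14r): metal plus welding cost in cents."""
    return 4 * math.pi * (16 / r + r**2 + 14 * r)


def cubic(r):
    """r^3 + 7r^2 - 8; its positive root is the critical point of the cost."""
    return r**3 + 7 * r**2 - 8


def bisect(f, lo, hi, steps=60):
    """Plain bisection; assumes f(lo) and f(hi) have opposite signs."""
    for _ in range(steps):
        mid = (lo + hi) / 2
        if f(lo) * f(mid) <= 0:
            hi = mid      # the sign change is in the left half
        else:
            lo = mid      # the sign change is in the right half
    return (lo + hi) / 2


r_root = bisect(cubic, 0.1, 5.0)

# The remaining roots solve r^2 + 8r + 8 = 0; both are negative.
other_roots = [(-8 + math.sqrt(32)) / 2, (-8 - math.sqrt(32)) / 2]

print("positive root:", r_root)
print("other roots:", other_roots)
print("cost in cents at r = 1:", welded_cost(1))
```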


13.1.5 Using implicit differentiation in optimization

Before we move on to our final example, let’s just take another look at the
first part of the previous example. There we knew that

$$C = 4\pi(rh + r^2) \quad\text{and}\quad r^2 h = 16,$$

and we minimized C by eliminating the variable h. Another way of doing
the minimization is to differentiate both sides implicitly with respect to the
variable r, which is the one we wanted to keep anyway. (See Section 8.1 in
Chapter 8 for a review of implicit differentiation.) Here’s what we get:

$$\frac{dC}{dr} = 4\pi\left(h + r\frac{dh}{dr} + 2r\right) \quad\text{and}\quad 2rh + r^2\frac{dh}{dr} = 0.$$

Check to make sure you agree with this! Anyway, if we solve the second
equation for dh/dr, then since r ≠ 0, we get

$$\frac{dh}{dr} = -\frac{2rh}{r^2} = -\frac{2h}{r}.$$

Substituting this into the first equation gives

$$\frac{dC}{dr} = 4\pi(h - 2h + 2r) = 4\pi(2r - h).$$

This is 0 exactly when 2r = h, which is what we found before! To check that
the critical point here is a minimum, differentiate the above equation once
more to get

$$\frac{d^2C}{dr^2} = 4\pi\left(2 - \frac{dh}{dr}\right) = 4\pi\left(2 + \frac{2h}{r}\right).$$

(Here we have used the fact from above that dh/dr = −2h/r.) The main thing
is that the right-hand side of the above equation is always positive, so the
graph of C against r is concave up and we do have a minimum. Of course,
knowing that 2r = h at the minimum doesn’t tell you what either r or h is. To
find that, substitute into the equation 16 = r²h to get 16 = 2r³, so r = 2 and
h = 4 as before.

Now try to redo the follow-up part of the question using implicit
differentiation.


The water is quite shallow for the first mile east of the lighthouse,
but much deeper after that. It takes your crew only 1 day per mile to run
the cable in the shallow water, but it takes 5 days per mile to run it in the
deep water. Show that the quickest way to run the cable is as in the following
picture (in which all measurements are in miles), and find out how long it
Indeed, if there were other critical points, then they would all be local minima
as the second derivative is positive. You just can’t have lots of local minima 
without local maxima in between, so there aren’t any. This means that y = 1 
is also the global minimum, which is what we want. 

We have nearly finished: just substitute y = 1 into the equation for T to
see that

$$T = \sqrt{(2 - 1)^2 + 1} + 5\sqrt{1^2 + 49} = \sqrt{2} + 5\sqrt{50} = \sqrt{2} + 25\sqrt{2} = 26\sqrt{2},$$

so it takes 26√2 days in total (or approximately 36.77 days).

Before we move on to our next topic, let’s just look at one other way to
see that y = 1 is a minimum. The trick is to take the expression

$$\frac{dT}{dy} = -\frac{2 - y}{\sqrt{(2 - y)^2 + 1}} + \frac{5y}{\sqrt{y^2 + 49}}$$

and rewrite it in a clever way. In the second term on the right, we divide top
and bottom by y, while in the first term, we divide by (2 − y). Making the
reasonable assumption that y and (2 − y) are both positive, we can write

$$\frac{dT}{dy} = -\frac{1}{\sqrt{1 + \dfrac{1}{(2 - y)^2}}} + \frac{5}{\sqrt{1 + \dfrac{49}{y^2}}}.$$

What happens when y gets bigger? Well, (2 − y) gets smaller, as does (2 − y)²,
so 1/(2 − y)² gets bigger. This means that the denominator in the first term
gets bigger, so its reciprocal gets smaller, but its negative gets bigger. If you
have chased this around properly, you’ll have to conclude that when y gets
bigger, so does the first term above. In the same way, 49/y² gets smaller,
so the denominator of the second term gets smaller, but the term itself gets
bigger.

What we’ve just shown, without too much work, is that dT/dy is an increasing
function of y, at least on the interval (0, 2). Since dT/dy is increasing,
its derivative d²T/dy² is positive! So we have managed to show that the
second derivative is positive without actually having to calculate it, and we
conclude that y = 1 is a minimum, once again.
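
Here is a quick numerical check of this argument in Python. It is only a sketch: it takes the formulas for T and dT/dy exactly as they are quoted above, and the names and the choice of 199 sample points are made up for this illustration.

```python
import math


def total_time(y):
    """T = sqrt((2 - y)^2 + 1) + 5*sqrt(y^2 + 49), in days."""
    return math.sqrt((2 - y)**2 + 1) + 5 * math.sqrt(y**2 + 49)


def time_derivative(y):
    """dT/dy = -(2 - y)/sqrt((2 - y)^2 + 1) + 5y/sqrt(y^2 + 49)."""
    return -(2 - y) / math.sqrt((2 - y)**2 + 1) + 5 * y / math.sqrt(y**2 + 49)


# Sample dT/dy at y = 0.01, 0.02, ..., 1.99 and see whether it keeps increasing.
samples = [k / 100 for k in range(1, 200)]
slopes = [time_derivative(y) for y in samples]
is_increasing = all(a < b for a, b in zip(slopes, slopes[1:]))

best_time = total_time(1)

print("dT/dy at y = 1:", time_derivative(1))
print("dT/dy increasing on the samples:", is_increasing)
print("T at y = 1:", best_time)
```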

13.2 Linearization

Now we’re going to use the derivative to estimate certain quantities. For
example, suppose you want to get a decent estimate of √11 without using a
calculator. We know that √11 is a little bigger than √9 = 3, so you could
certainly say that √11 is approximately 3-and-a-bit. That’s OK, but you can
actually do a better job without too much work. Here’s how it’s done.

Start off by setting f(x) = √x for any x ≥ 0. We want to estimate
the value of f(11) = √11, since we don’t know the actual value. On the
other hand, we know exactly what f(9) is: it’s just √9 = 3. Inspired by our
knowledge of f(x) when x = 9, let’s sketch the graph of y = f(x), and draw
in the tangent line through the point (9, 3), like this:
The tangent line, which I’ve written as y = L(x), is very close to the curve
y = f(x) when x is near 9. It’s not so close when x is near 0. That’s not
important, since we want to approximate f(11), and 11 is pretty close to 9.
In the above picture, the line and the curve are close to each other at x = 11.
This means that the value of L(11) is a good approximation to f(11) = √11.
Indeed, look how close the two values are on the y-axis in the picture above!

All this is irrelevant unless we can actually calculate L(11). So let’s do
it. The linear function L(x) passes through the point (9, 3), and since it’s the
tangent to the curve y = f(x) at x = 9, the slope of L(x) is exactly f'(9).
Now f'(x) = 1/(2√x), so f'(9) = 1/(2√9) = 1/6. So, L(x) has slope 1/6 and
passes through (9, 3). Its equation is therefore

$$y - 3 = \frac{1}{6}(x - 9),$$

which simplifies to y = x/6 + 3/2. That is,

$$L(x) = \frac{x}{6} + \frac{3}{2}.$$

Now all we need to do is calculate L(11) by substituting x = 11 into the above
equation. We get

$$L(11) = \frac{11}{6} + \frac{3}{2} = \frac{10}{3} = 3\tfrac{1}{3}.$$

We conclude that

$$\sqrt{11} \approx 3\tfrac{1}{3}.$$

That’s a lot better than 3-and-a-bit! In fact, you can use a calculator to see
that √11 is 3.317 (to three decimal places), so the approximation 3⅓ is pretty
good.
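
To see the comparison side by side, here is a tiny Python sketch. The function name is invented for this illustration; the formula inside it is just the tangent line L(x) = x/6 + 3/2 from above.

```python
import math


def tangent_at_9(x):
    """L(x) = x/6 + 3/2, the tangent line to y = sqrt(x) at the point (9, 3)."""
    return x / 6 + 3 / 2


estimate = tangent_at_9(11)
actual = math.sqrt(11)

print(f"L(11) = {estimate:.4f}, sqrt(11) = {actual:.4f}")
print(f"difference = {estimate - actual:.4f}")
```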

13.2.1 Linearization in general

Let’s generalize the above example. If you want to estimate some quantity, try
to write it as f(x) for some nice function f. In the above example, we wanted
to estimate √11, so we set f(x) = √x and realized that we were interested in
the value of f(11).

Next, we pick some number a, close to x, such that f(a) is really nice. In
our example, we couldn’t deal with f(11), but f(9) was nice because we can
take the square root of 9 without any problems. We could have chosen a = 25
instead, since we understand √25, but this isn’t as good because 25 is quite
far away from 11.

So, given our function f and our special number a, we find the tangent to
the curve y = f(x) at the point (a, f(a)). This tangent has slope f'(a), so its
equation is

$$y - f(a) = f'(a)(x - a).$$

If the tangent line is y = L(x), then by adding f(a) to both sides in the above
equation, we get

$$L(x) = f(a) + f'(a)(x - a).$$

The linear function L is called the linearization of f at x = a. Remember,
we’re going to use L(x) as an approximation to f(x). So we have

$$f(x) \approx L(x) = f(a) + f'(a)(x - a),$$


with the understanding that the approximation is very good when x is close
to a. In fact, if x actually equals a, the approximation is perfect! Both sides of
the above equation become f(a). This isn’t helpful, though, since we already
understand f(a). The benefit is that we now have an approximation for f(x)
for x near a.

Let’s check that our formula works for the example in the previous section.
We have f(x) = √x and a = 9. Clearly f(a) = f(9) = 3; and since
f'(x) = 1/(2√x), we have f'(9) = 1/(2√9) = 1/6. According to the formula, the
linearization is given by

$$L(x) = f(a) + f'(a)(x - a) = 3 + \frac{1}{6}(x - 9).$$

This agrees with our formula L(x) = x/6 + 3/2 from above, which we used to
find the estimate √11 ≈ 3⅓. Now, how would you estimate √8? We see that
8 is also close to 9, so we can just use the same linearization:

$$\sqrt{8} = f(8) \approx L(8) = 3 + \frac{1}{6}(8 - 9) = \frac{17}{6}.$$

So the formula L(x) = 3 + (x − 9)/6 gives a good approximation to √x for
any x near 9, not just 11.

On the other hand, suppose you also want to estimate √62. It wouldn’t
be ideal to use L(62) as an approximation. Let’s see what happens if we do:

$$L(62) = 3 + \frac{62 - 9}{6} = 11\tfrac{5}{6}.$$

Wait a second, √62 should be a little less than √64, which is 8. The value of
L(62), which is 11⅚, is way too high. The problem is that our linearization is
at x = 9, while 62 is a long way from 9; so the approximation isn’t very good.
To estimate √62, you’re much better off using the linearization at x = 64
instead. So, set a = 64; we now have f(a) = f(64) = 8 and f'(a) = f'(64) = 1/16.
This means that our new linearization is given by

$$L(x) = f(a) + f'(a)(x - a) = 8 + \frac{1}{16}(x - 64).$$

When x = 62, we have

$$\sqrt{62} = f(62) \approx L(62) = 8 + \frac{1}{16}(62 - 64) = 7\tfrac{7}{8}.$$

This approximation makes a lot more sense.
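
The general formula L(x) = f(a) + f'(a)(x − a) translates almost word for word into code. The Python sketch below is one possible way to write it; the helper names are invented for this illustration, and you have to supply the derivative yourself. It builds the linearizations of √x at a = 9 and at a = 64 and compares them at a few points.

```python
import math


def linearization(f, f_prime, a):
    """Return the function L(x) = f(a) + f'(a)(x - a)."""
    fa, slope = f(a), f_prime(a)
    return lambda x: fa + slope * (x - a)


def sqrt_prime(x):
    """Derivative of sqrt(x), namely 1/(2*sqrt(x))."""
    return 1 / (2 * math.sqrt(x))


L_at_9 = linearization(math.sqrt, sqrt_prime, 9)
L_at_64 = linearization(math.sqrt, sqrt_prime, 64)

# Compare the true square root with both linearizations.
for x in (8, 11, 62):
    print(x, math.sqrt(x), L_at_9(x), L_at_64(x))
```

Running this shows the same pattern as the hand calculations: each linearization does well near its own value of a and badly far away from it.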

13.2.2 The differential

Let’s take a look at the general situation of linearization.

Let’s define Δx to be x − a, so that x = a + Δx. Then the approximation becomes

$$f(a + \Delta x) \approx f(a) + f'(a)\,\Delta x.$$

Here’s a graph of the situation:
In the above graph, there’s one more quantity marked: this is df, which is
the difference between the height of P and f(a). It is the amount we needed
to add to f(a) in order to get our estimate. Since L(a + Δx) = f(a) + f'(a)Δx,
we see that

$$df = f'(a)\,\Delta x.$$

The quantity df is called the differential of f at x = a. It is an approximation
to the amount that f changes when x moves from a to a + Δx.

We’ve actually touched on these ideas before. In Section 5.2.7 of Chapter 5,
we saw that if y = f(x), then

$$f'(x) = \lim_{\Delta x \to 0} \frac{\Delta y}{\Delta x}.$$

This means that a small change in x produces a change in y of approximately
f'(x) times that amount. This is exactly what the equation df = f'(a)Δx says,
taking into account that this time we are starting at x = a.

For example, suppose we want to estimate (6.01)². Set f(x) = x² and
a = 6; then you can easily see that f'(x) = 2x, so that f'(6) = 12. We want
to know what happens when we shift x from 6 over by the amount 0.01; so
we should set Δx = 0.01. We have

$$df = f'(a)\,\Delta x = f'(6)(0.01) = 12(0.01) = 0.12.$$

So if we add 0.12 to the value of f(a), we should get a good approximation.
Since f(a) = f(6) = 6² = 36, this means that (6.01)² ≈ 36.12. Now look
back at Section 5.2.7 in Chapter 5 again: we solved the same example
there, using basically the same method; we just have some nicer formulas
now, that’s all!

Here’s another example of how to use the differential. Suppose that you
use a ruler to measure the diameter of a round ball and get 6 inches, but this
measurement is only accurate to 0.5%. If we use our measurement to calculate
the volume of the ball, how accurate is our result? Let’s use the differential
to work this out, at least approximately. If the ball has radius r, diameter D,
and volume V, then r = D/2, so

$$V = \frac{4}{3}\pi r^3 = \frac{4}{3}\pi\left(\frac{D}{2}\right)^3 = \frac{\pi D^3}{6}.$$

When D = 6, we have V = π(6)³/6 = 36π. So we’ve calculated the volume
to be 36π cubic inches, but the true answer might be a little more or a little
less. To find out how much more or less, let’s use the above boxed formula,
df = f'(a)Δx. We need to change f to V, a to 6, and x to D to get the
appropriate formula for this case:

$$dV = V'(6)\,\Delta D.$$

Differentiating the previous formula for V with respect to D, we find that

$$V'(D) = \frac{\pi(3D^2)}{6} = \frac{\pi D^2}{2}.$$
This means that V'(6) = 18π, so

$$dV = 18\pi\,\Delta D.$$

This equation means that if you change the diameter D from 6 to 6 + ΔD,
the volume V changes by about 18πΔD. In our case, the true diameter might
differ from 6 inches by up to 0.5%, which is 0.005 × 6 = 0.03 inches. So ΔD
might be as high as 0.03 or as low as −0.03; in this worst case scenario, we
have

$$dV = 18\pi \times (\pm 0.03) = \pm 0.54\pi.$$

This is a good approximation to the true error in the measurement, so we
can say that the volume of the ball is 36π cubic inches, accurate to about
0.54π cubic inches. Since the original error in the diameter was expressed as
a percentage of the diameter, we should probably do the same for the volume.
In percentage terms, an approximate error of dV = ±0.54π on a quantity
V = 36π is

$$\frac{dV}{V} \times 100\% = \frac{\pm 0.54\pi}{36\pi} \times 100\% = \pm 1.5\%.$$

In other words, the relative (percentage) error in the volume measurement
is about three times the relative error in the original diameter measurement.
That’s what happens when you compound the error in a one-dimensional
measurement in the calculation of a three-dimensional quantity.
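
How good is the differential as an estimate of the error here? The Python sketch below compares it with the actual change in volume when the diameter really is 6.03 inches. It is only an illustration, and the names are invented for this example.

```python
import math


def volume(D):
    """V = pi*D^3/6: volume of a ball of diameter D."""
    return math.pi * D**3 / 6


def volume_derivative(D):
    """dV/dD = pi*D^2/2."""
    return math.pi * D**2 / 2


D = 6.0
delta_D = 0.005 * D                      # 0.5% of the measured diameter
dV = volume_derivative(D) * delta_D      # the differential: estimated error
relative_error = dV / volume(D)

# Actual change in volume if the true diameter were 6.03 inches.
true_change = volume(D + delta_D) - volume(D)

print("estimated error:", dV, "actual change:", true_change)
print("relative error in the volume:", relative_error)
```

The two numbers agree to within about half a percent of each other, which is why the differential is good enough for this sort of error estimate.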


13.2.3 Linearization summary and examples

Here’s the basic strategy for estimating, or approximating, a nasty number:

1. Write down the main formula

$$f(x) \approx L(x) = f(a) + f'(a)(x - a).$$

2. Choose a function f, and a number x such that the nasty number is
equal to f(x). Also, choose a close to x such that f(a) can easily be
computed.

3. Differentiate f to find f'.

4. In the above formula, replace f and f' by the actual functions, and a
by the actual number you’ve chosen.

5. Finally, plug in the value of x from step 2 above. Also note that the
differential df is the quantity f'(a)(x − a).

Let’s look at a few examples. First, how would you estimate sin(11π/30)?
Start off with the standard formula

$$f(x) \approx L(x) = f(a) + f'(a)(x - a).$$

We need to take the sine of something, so let’s set f(x) = sin(x). We are
interested in what happens when x = 11π/30. Now, we need some number
a which is close to 11π/30, such that f(a) is nice. Of course, f(a) is just
sin(a). What number close to 11π/30 has a manageable sine? How about
10π/30? After all, that’s just π/3, and we certainly understand sin(π/3). So,
set a = π/3.

We’ve completed the first two steps. Moving on to the third step, we find
that f'(x) = cos(x), so the linearization formula becomes

$$f(x) \approx L(x) = \sin\left(\frac{\pi}{3}\right) + \cos\left(\frac{\pi}{3}\right)\left(x - \frac{\pi}{3}\right).$$

Since f(x) = sin(x), this simplifies to

$$\sin(x) \approx L(x) = \frac{\sqrt{3}}{2} + \frac{1}{2}\left(x - \frac{\pi}{3}\right).$$

Finally, put x = 11π/30 to get

$$\sin\left(\frac{11\pi}{30}\right) \approx \frac{\sqrt{3}}{2} + \frac{1}{2}\left(\frac{11\pi}{30} - \frac{\pi}{3}\right) = \frac{\sqrt{3}}{2} + \frac{\pi}{60}.$$

This may still seem bad, but at least the estimate doesn’t involve any trig
functions, only the numbers π and √3, which are not too hard to deal with.
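
As a sanity check, the following Python sketch compares the estimate √3/2 + π/60 with the value of sin(11π/30) that the computer’s own sine function gives. The variable names are invented for this illustration.

```python
import math

a = math.pi / 3            # the nice nearby point
x = 11 * math.pi / 30      # the point we actually care about

# L(x) = sin(a) + cos(a)(x - a), which works out to sqrt(3)/2 + pi/60.
estimate = math.sin(a) + math.cos(a) * (x - a)
closed_form = math.sqrt(3) / 2 + math.pi / 60
actual = math.sin(x)

print("estimate:", estimate)
print("actual:  ", actual)
print("difference:", estimate - actual)
```

The estimate comes out slightly too big, which fits the picture: the sine curve is concave down near π/3, so the tangent line sits above it.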

Now, consider this example: find an approximation for ln(0.99) using a
linearization. Well, this time we set f(x) = ln(x) and note that we are
interested in what happens when x = 0.99. A number near 0.99 that is nice,
so far as taking the log of it is concerned, is 1; so we set a = 1. Since
f(x) = ln(x) and f'(x) = 1/x, the formula f(x) ≈ L(x) = f(a) + f'(a)(x − a)
becomes

$$\ln(x) \approx L(x) = \ln(1) + \frac{1}{1}(x - 1).$$

Since ln(1) = 0, we have shown that

$$\ln(x) \approx x - 1.$$

Replacing x by 0.99, we get

$$\ln(0.99) \approx L(0.99) = 0.99 - 1 = -0.01,$$

and we’re done.

More generally, how would you find an approximation for ln(1 + h), where
h is any small number? In fact, you can use the linearization that we just
found, f(x) ≈ L(x) = x − 1, to approximate ln(1 + h). Just replace x by 1 + h
and we see that ln(1 + h) ≈ L(1 + h) = (1 + h) − 1. That is,

$$\ln(1 + h) \approx h$$

when h is small. Actually, this shouldn’t be a surprise! In Section 9.4.3 of
Chapter 9, we saw that

$$\lim_{h \to 0} \frac{\ln(1 + h)}{h} = 1,$$

so we already knew that ln(1 + h) is approximately equal to h when h is small.

Finally, how about an approximation for ln(e + h) when h is small? We
now need a different linearization, as the quantity (e + h) is close to e, not
1. So let’s set a = e and start again, once again using f(x) = ln(x) and
f'(x) = 1/x. We get

\[ f(x) \approx L(x) = f(a) + f'(a)(x - a) = \ln(e) + \frac{1}{e}(x - e). \]

Since ln(e) = 1, we get

\[ \ln(x) \approx L(x) = 1 + \frac{x}{e} - 1 = \frac{x}{e}. \]

When x = e + h, we have

\[ \ln(e + h) \approx L(e + h) = \frac{e + h}{e} = 1 + \frac{h}{e}. \]

That is, ln(e + h) ≈ 1 + h/e when h is small. This answer is quite different
from the answer in the previous example, where we saw that ln(1 + h) ≈ h for
small h. Everything depends on the value of a.
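
Here is a small Python sketch comparing the two linearizations of ln with
the true values. The function names ln_near_1 and ln_near_e are invented for
this example; they are just the formulas h and 1 + h/e from above.

```python
import math

def ln_near_1(h):
    """Linearization of ln at a = 1: ln(1 + h) is roughly h."""
    return h

def ln_near_e(h):
    """Linearization of ln at a = e: ln(e + h) is roughly 1 + h/e."""
    return 1 + h / math.e

for h in (0.1, 0.01, -0.01):
    print(h, math.log(1 + h), ln_near_1(h), math.log(math.e + h), ln_near_e(h))
```

In each row the approximation should get closer to the true value as h
shrinks, and the two approximations are clearly not interchangeable.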

13.2.4 The error in our approximation 

We’ve been using L(x) as an approximation for f(x). They are not the same
thing, though. The question is, how wrong could we be to use L(x) instead
of f(x)? The way to find out is to consider the difference between the two
quantities. The smaller that difference, the better the approximation. So, set

\[ r(x) = f(x) - L(x), \]

where r(x) is the error* in using the linearization at x = a in order to estimate
f(x). It turns out that if the second derivative of f exists, at least between x
and a, then there’s a nice formula† for r(x):

\[ r(x) = \frac{1}{2} f''(c)(x - a)^2 \quad \text{for some number } c \text{ between } x \text{ and } a. \]

The problem is, we don’t know what c is, only that it’s between x and a. The
above formula is related to the Mean Value Theorem, which we looked at in
Section 11.3 of Chapter 11. Since that theorem tells you about a number c
without telling you much about it, we shouldn’t be surprised to see it popping
up here.

We can use the above formula to tell us two things. First, note that the
quantity (x - a)^2 is always positive (as long as x ≠ a). This means that the
sign of r(x) is the same as the sign of f''(c). So if we know that the curve is
concave up, at least between a and x, then r(x) is positive. Since
r(x) = f(x) - L(x), we see that f(x) > L(x). This means that our estimate L(x)
is lower than f(x), so we have made an underestimate. This situation is shown
in the graph in Section 13.2.2 above. On the other hand, if the curve is concave
down, then f''(c) must be negative; so you can chase it around and see that
L(x) > f(x). This means that our approximation is an overestimate.

*The letter r in r(x) stands for “remainder,” since it’s what’s left when you
remove the linearization.

†See Section A.6.9 of Appendix A for a proof.

For example, when we estimated √11 at the beginning of Section 13.2
above, we used f(x) = √x. If you calculate that f'(x) = 1/(2√x) and that
f''(x) = -1/(4x√x), you can see that the curve is always concave down. Or
you can just see it from the graph. In any case, we see that the estimate that
we found (3⅓) must be an overestimate.

In summary, 

• if f'' is positive between a and x, then using the linearization leads to
an underestimate;

• if f'' is negative between a and x, then using the linearization leads to
an overestimate.


Now look back at the equation for the error r(x) above. If we take absolute
values of both sides of the equation, then we get

\[ |\text{error}| = \frac{1}{2} |f''(c)| \, |x - a|^2. \]

Suppose we know that the biggest |f''(t)| could be, as t ranges between a and
x, is some number M. Then even though we don’t know what c is, we do
know that |f''(c)| ≤ M, so we get the following formula:

\[ |\text{error}| \le \frac{1}{2} M |x - a|^2. \]

Again, M is the largest value of |f''(t)| for all t between x and a. Actually, the
important factor in the above equation isn’t the M; it’s the |x - a|^2 factor.
You see, when x is close to a, the quantity |x - a| is small, but when you
square it, it becomes tiny. (For example, when you square 0.01, you get the
tiny number 0.0001.) This means that the error is small, so our approximation
is good!

Let’s see how this applies to our above example of estimating √11. We set
f(x) = √x, f'(x) = 1/(2√x) and f''(x) = -1/(4x√x). We also took a = 9 and
x = 11. The question is, how big could the value of |f''(t)| be for t between
9 and 11? Clearly

\[ |f''(t)| = \frac{1}{4t\sqrt{t}}. \]

The right-hand side is a decreasing function of t, so it’s biggest when t is
smallest, that is, when t = 9. So M = |f''(9)|, which turns out to be 1/108.
The conclusion is that

\[ |\text{error}| \le \frac{1}{2} M |x - a|^2 = \frac{1}{2} \cdot \frac{1}{108} \cdot |11 - 9|^2 = \frac{1}{54}. \]

So when we said earlier that √11 ≈ 3⅓, now we have confidence that we’re
pretty close. In fact, we are within ±1/54 of the correct answer. More
precisely, we actually know that

\[ 3\tfrac{1}{3} - \frac{1}{54} \le \sqrt{11} \le 3\tfrac{1}{3} + \frac{1}{54}. \]

In fact, since we discovered earlier that 3⅓ is an overestimate for √11, we can
say more:

\[ 3\tfrac{1}{3} - \frac{1}{54} \le \sqrt{11} \le 3\tfrac{1}{3}. \]


Now, let’s repeat this for the example of estimating ln(0.99), which we
looked at in Section 13.2.3 above. There we saw that ln(0.99) ≈ -0.01. How
good is this approximation? With f(x) = ln(x), we have f'(x) = 1/x and
f''(x) = -1/x^2. Since the second derivative is negative, we again have an
overestimate. Now, when t ranges between a = 1 and x = 0.99, how big could
|f''(t)| = 1/t^2 be? Again, this is decreasing in t, so the biggest value occurs
when t = 0.99. So we have M = 1/(0.99)^2, and our error estimate looks like
this:

\[ |\text{error}| \le \frac{1}{2} M |x - a|^2 = \frac{1}{2} \cdot \frac{1}{(0.99)^2} \cdot |0.99 - 1|^2 = \frac{1}{20000(0.99)^2}. \]

This simplifies to about 0.000051, which is really tiny. This means that -0.01
is a very good approximation to ln(0.99). More precisely, we’ve proved the
inequalities

\[ -0.01 - \frac{1}{20000(0.99)^2} \le \ln(0.99) \le -0.01 + \frac{1}{20000(0.99)^2}. \]

In fact, since -0.01 is an overestimate, we can once again tighten up the
right-hand side and write that

\[ -0.01 - \frac{1}{20000(0.99)^2} \le \ln(0.99) \le -0.01. \]

We’ve narrowed down the value of ln(0.99) to lie in a really tiny interval.
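
The two error estimates in this section are easy to check on a computer.
The following Python sketch does that; the helper error_bound is a
hypothetical name for the formula (1/2) M |x - a|^2, and the values of M are
the ones worked out in the text.

```python
import math

def error_bound(M, x, a):
    """Upper bound (1/2) * M * |x - a|^2 on |f(x) - L(x)|."""
    return 0.5 * M * abs(x - a) ** 2

# sqrt(11) with a = 9: M = |f''(9)| = 1/108 and the estimate is 3 + 1/3.
sqrt_est = 3 + 1 / 3
sqrt_bound = error_bound(1 / 108, 11, 9)         # works out to 1/54
sqrt_err = abs(math.sqrt(11) - sqrt_est)

# ln(0.99) with a = 1: M = 1/(0.99)^2 and the estimate is -0.01.
ln_est = -0.01
ln_bound = error_bound(1 / 0.99 ** 2, 0.99, 1)   # about 0.000051
ln_err = abs(math.log(0.99) - ln_est)

print(sqrt_err, sqrt_bound)
print(ln_err, ln_bound)
```

In both rows the actual error (first number) should come out no bigger than
the bound (second number), and both estimates should be overestimates.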

We’re going to return to the topic of finding approximations and estimating 
errors when we look at Taylor series in Chapter 24. There we’ll use not only 
the first derivative, but the second and higher derivatives to get even better 
approximations. 


13.3 Newton's Method 

Here’s another useful application of linearization. Suppose that you have an
equation of the form f(x) = 0 that you’d like to solve, but you just can’t
solve the darned thing. So you do the next best thing: you take a guess at a
solution, which you call a. The situation might look something like this:

The idea is to follow the linearization L of f at x = a down to the x-axis;
call its x-intercept b. The linearization is L(x) = f(a) + f'(a)(x - a),
as in Section 13.2.1 above. To find the x-intercept, set L(x) = 0; then we
have f(a) + f'(a)(x - a) = 0. Solving for x, we get

\[ x = a - \frac{f(a)}{f'(a)}. \]

Since we called the x-intercept b, we have found the following formula:


Newton’s method: suppose that a is an approximation
to a solution of f(x) = 0. If you set

\[ b = a - \frac{f(a)}{f'(a)}, \]

then a lot of the time b is a better approximation than a.


It doesn’t work all the time, so I put in the phrase “a lot of the time” to cover
my ass. We’ll come back to this detail a little later. First, let’s look at
some examples. Suppose that

\[ f(x) = x^5 + 2x - 1 \]

and we’d like to find a solution to the equation f(x) = 0. Does it even have
one? Since f is continuous, f(0) = -1 (negative), and f(1) = 2 (positive),
the Intermediate Value Theorem (see Section 5.1.4 in Chapter 5) shows that
the equation has at least one solution. On the other hand, f'(x) = 5x^4 + 2,
which is always positive; so f is always increasing, which means that the
equation f(x) = 0 has at most one solution. (See Section 10.1.1 in Chapter 10
to remind yourself about this.) We have shown that the equation has exactly
one solution. Let’s approximate the solution as 0. We know that f(0) = -1,
which isn’t very close to 0. No problem, just use Newton’s method with a = 0:

\[ b = a - \frac{f(a)}{f'(a)} = 0 - \frac{f(0)}{f'(0)} = 0 - \frac{0^5 + 2(0) - 1}{5(0)^4 + 2} = \frac{1}{2}. \]

So b = 1/2 should be a better approximation than 0. Indeed, you can calculate
that f(1/2) = 1/32, which is quite close to 0. What’s to stop us repeating
the method and getting an even better solution? Nothing! Let’s now take
a = 1/2 instead, and repeat:

\[ b = a - \frac{f(a)}{f'(a)} = \frac{1}{2} - \frac{f(1/2)}{f'(1/2)} = \frac{1}{2} - \frac{1/32}{37/16} = \frac{18}{37}. \]

(Here we used the calculation f'(1/2) = 5(1/2)^4 + 2 = 37/16.) Anyway, this
means that 18/37 should be an even better approximation to the true zero
of f. If you calculate f(18/37), you’ll get something close to 0.0002, which is
pretty darned small. The number 18/37 is really a pretty good approximation
to the true zero of f.












It might seem confusing to reuse a and b like this. A way around it is
to use x_0 as the initial guess and x_1 as the first improvement; then x_2 is the
second improvement, starting with x_1, and so on. The formula can now be
written like this:

\[ x_1 = x_0 - \frac{f(x_0)}{f'(x_0)}, \qquad x_2 = x_1 - \frac{f(x_1)}{f'(x_1)}, \qquad x_3 = x_2 - \frac{f(x_2)}{f'(x_2)}, \]

and so on.
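
To see the formula in action, here is a minimal Python sketch that redoes
the example f(x) = x^5 + 2x - 1 from above using exact fractions, so the
answers 1/2 and 18/37 appear exactly. The function name newton_step is just
a label chosen for this illustration.

```python
from fractions import Fraction

def newton_step(f, fprime, a):
    """One step of Newton's method: b = a - f(a)/f'(a)."""
    slope = fprime(a)
    if slope == 0:
        # A horizontal tangent line never reaches the x-axis.
        raise ZeroDivisionError("f'(a) is 0, so b is not defined")
    return a - f(a) / slope

def f(x):
    return x**5 + 2*x - 1

def fprime(x):
    return 5*x**4 + 2

x0 = Fraction(0)
x1 = newton_step(f, fprime, x0)   # Fraction(1, 2)
x2 = newton_step(f, fprime, x1)   # Fraction(18, 37)

print(x1, x2, float(f(x2)))
```

Swapping Fraction(0) for the float 0.0 gives the same steps in decimal
form, which is what you’d normally do once the fractions get ugly.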


Here’s another example. To find an approximate solution of the equation
x = cos(x), first set f(x) = x - cos(x). If we can estimate the zero of f, then
the same number will be an approximate solution of x = cos(x). (We already
used this trick in Section 5.1.4 of Chapter 5.) Let’s make the guess x_0 = π/2;
then f(π/2) = π/2 - cos(π/2) = π/2. That’s a pretty lousy guess. Never
mind; since f'(x) = 1 + sin(x), we have f'(π/2) = 1 + sin(π/2) = 2. This
means that

\[ x_1 = x_0 - \frac{f(x_0)}{f'(x_0)} = \frac{\pi}{2} - \frac{\pi/2}{2} = \frac{\pi}{4}. \]

So x_1 = π/4 is a better approximation; indeed, f(π/4) works out to be the
quantity π/4 - 1/√2, which is about 0.08. Now repeat:

\[ x_2 = x_1 - \frac{f(x_1)}{f'(x_1)} = \frac{\pi}{4} - \frac{f(\pi/4)}{f'(\pi/4)} = \frac{\pi}{4} - \frac{\pi/4 - 1/\sqrt{2}}{1 + 1/\sqrt{2}}, \]

since f'(π/4) = 1 + sin(π/4) = 1 + 1/√2. The above equation simplifies to

\[ x_2 = \frac{1 + \pi/4}{1 + \sqrt{2}} = (1 + \pi/4)(\sqrt{2} - 1), \]

which is actually a little less than π/4. Also, f(x_2) turns out to be about
0.0008. This means that x_2 - cos(x_2) is about 0.0008, so the number x_2 above
is a pretty good approximation to the solution of the equation x = cos(x).
Of course, we could repeat the method to find an even better approximation
x_3, but the calculations become horrible. Computers and calculators are
very good at it, though, and in fact often use Newton’s method to give good
approximations. (Remember, a calculator only gives approximations! Even
10 or 12 decimal places is still not exact, although it’s close enough in most
situations.)
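
Since the hand calculations get horrible after x_2, here is a short Python
sketch that lets the computer carry on. The function name newton and the
choice of five steps are assumptions made for this illustration only.

```python
import math

def newton(f, fprime, x0, steps):
    """Return the list [x0, x1, ..., x_steps] produced by Newton's method."""
    xs = [x0]
    for _ in range(steps):
        x = xs[-1]
        xs.append(x - f(x) / fprime(x))
    return xs

f = lambda x: x - math.cos(x)
fprime = lambda x: 1 + math.sin(x)

xs = newton(f, fprime, math.pi / 2, 5)
for n, x in enumerate(xs):
    print(n, x, f(x))
```

The rows for n = 1 and n = 2 should match π/4 and (1 + π/4)(√2 - 1) from the
text, and the values of f(x) in the last column should shrink very quickly.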

As we noted before (but never explained), sometimes Newton’s method
doesn’t work. Here are four different things that could go wrong:

1. The value of f'(a) could be near 0. Clearly, if

\[ b = a - \frac{f(a)}{f'(a)}, \]

then f'(a) can’t be 0 or else b isn’t even defined. In that case, the tangent
line at x = a doesn’t even intersect the x-axis, since it’s horizontal! Even
if f'(a) is close, but not equal to 0, Newton’s method can still give a
whacked-out result; for example, check out this picture:

16, and so on. These are just getting farther and farther away from the
correct value 0. There’s not much you can do with Newton’s method if
this sort of thing happens.

4. You might get stuck in a loop. It’s possible that your estimate a
leads to another estimate b, which then leads back to a again. This
means that there’s no point in repeating the process, as you just keep
going around in circles! Here’s how the situation might look:

The linearization at x = a has x-intercept b, and the linearization at
x = b has x-intercept a, so Newton’s method just doesn’t work. A
concrete (but messy) example is

f(x) = ^x 2 — tan-i(:r). 

If you start with a = 1, I leave it to you to show that b = -1. Since f is
an odd function, it’s now clear that restarting with -1 leads to 1 again.
It’s pretty unlucky to encounter a loop! Just try some other starting
guess. (By the way, the study of these sorts of loops leads to a nice
type of fractal that you might have seen as a screensaver on someone’s
computer...)
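
If you want to watch a loop happen, here is a hedged Python sketch. The
function below, f(x) = (1 - π/4)x - tan⁻¹(x), is a stand-in chosen for this
illustration because it is odd and sends 1 to -1 under Newton’s method; it
is an assumption, not necessarily the example intended above.

```python
import math

C = 1 - math.pi / 4   # hypothetical coefficient, picked to force the loop

def f(x):
    return C * x - math.atan(x)

def fprime(x):
    return C - 1 / (1 + x * x)

def newton_step(a):
    """One step of Newton's method: b = a - f(a)/f'(a)."""
    return a - f(a) / fprime(a)

b = newton_step(1.0)      # lands (up to rounding) on -1
back = newton_step(b)     # and then returns (up to rounding) to 1

print(b, back)
```

Starting from a slightly different guess, such as 0.9, breaks the exact
loop, which is why trying some other starting guess is the usual fix.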







CHAPTER 14 


L’Hôpital’s Rule and Overview of Limits

We’ve used limits to find derivatives. Now we’ll turn things upside-down and
use derivatives to find limits, by way of a nice technique called l’Hôpital’s Rule.
After looking at various varieties of the rule, we’ll give a summary, followed
by an overview of all the methods we’ve used so far to evaluate limits. So,
we’ll look at:

• l’Hôpital’s Rule, and four types of limits which naturally lead to using
the rule; and

• a summary of limit techniques from earlier chapters.

14.1 L’Hôpital’s Rule

Most of the limits we’ve looked at are naturally in one of the following forms:

\[ \lim_{x \to a} \frac{f(x)}{g(x)}, \quad \lim_{x \to a} (f(x) - g(x)), \quad \lim_{x \to a} f(x)g(x), \quad \text{and} \quad \lim_{x \to a} f(x)^{g(x)}. \]

Sometimes you can just substitute x = a and evaluate the limit directly,
effectively using the continuity of f and g. This method doesn’t always work,
though; for example, consider the limits

\[ \lim_{x \to 3} \frac{x^2 - 9}{x - 3}, \quad \lim_{x \to 0} \left( \frac{1}{\sin(x)} - \frac{1}{x} \right), \quad \lim_{x \to 0^+} x \ln(x), \quad \text{and} \quad \lim_{x \to 0} (1 + 3\tan(x))^{1/x}. \]

In the first case, replacing x by 3 gives the indeterminate form 0/0. The
second limit involves the difference between two terms which become infinite
as x → 0. Actually, they both go to ∞ as x → 0+ and -∞ as x → 0-, so
you can think of the form in this case as ±(∞ - ∞). As for the third limit
above (involving x ln(x)), this leads to the form 0 × (-∞), remembering that
ln(x) → -∞ as x → 0+. Finally, the fourth limit looks like 1^∞, which is also
problematic. Luckily, all four types can often be solved using l’Hôpital’s Rule.

It turns out that the first type, involving the ratio f(x)/g(x), is the most
suitable for applying the rule, so we’ll call it “Type A.” The next two types,
involving f(x) - g(x) and f(x)g(x), both reduce directly to Type A, so we’ll
call them Type B1 and Type B2, respectively. Finally, we’ll say that limits
involving exponentials like f(x)^g(x) are Type C, since you can solve them
by reducing them to Type B2 and then back to Type A. Let’s look at all
these types individually, then summarize the whole situation in Section 14.1.6
below.


14.1.1 Type A: 0/0 case

Consider limits of the form

\[ \lim_{x \to a} \frac{f(x)}{g(x)}, \]

where f and g are nice differentiable functions. If g(a) ≠ 0, everything’s
great: you just substitute x = a to see that the limit is f(a)/g(a). If g(a) = 0
but f(a) ≠ 0, then you’re dealing with a vertical asymptote at x = a and the
above limit is either ∞, -∞ or it doesn’t exist. (See page 59 for graphs of
the four possibilities that can arise in this case.)

The only other possibility is that f(a) = 0 and g(a) = 0. That is, the
fraction f(a)/g(a) is the indeterminate form 0/0. The majority of the limits
we’ve seen have been of this form. In fact, every derivative is of this form!
After all,

\[ f'(x) = \lim_{h \to 0} \frac{f(x + h) - f(x)}{h}, \]

and if you put h = 0 in the fraction, you get 0/0. So let’s concentrate on the
case where f(a) = 0 and g(a) = 0.

Here’s the basic idea. Since f and g are differentiable, we can find the
linearization of both of them at x = a. In fact, as we saw in the previous
chapter, if x is close to a, then

\[ f(x) \approx f(a) + f'(a)(x - a) \quad \text{and} \quad g(x) \approx g(a) + g'(a)(x - a). \]

Now, we’re assuming that f(a) and g(a) are both zero. This means that

\[ f(x) \approx f'(a)(x - a) \quad \text{and} \quad g(x) \approx g'(a)(x - a). \]

If you divide the first equation by the second one, then assuming that x ≠ a,
we get

\[ \frac{f(x)}{g(x)} \approx \frac{f'(a)(x - a)}{g'(a)(x - a)} = \frac{f'(a)}{g'(a)}. \]

The closer x is to a, the better the approximation. This leads* us to one
version of l’Hôpital’s Rule:

\[ \text{if } f(a) = g(a) = 0, \text{ then } \lim_{x \to a} \frac{f(x)}{g(x)} = \lim_{x \to a} \frac{f'(x)}{g'(x)}, \]


provided that the limit on the right-hand side exists. (Actually, there’s
another condition as well: g'(x) can’t be 0 when x is close to, but not equal
to, a. You have to be really unlucky for this to be a problem, though!) It’s
really important that f(a) and g(a) are both zero, or else everything could
get screwed up.

*We haven’t actually proved l’Hôpital’s Rule here; see Section A.6.11 of
Appendix A for a real proof.

Let’s apply the rule to an example from the beginning of the chapter:

\[ \lim_{x \to 3} \frac{x^2 - 9}{x - 3}. \]

Notice that if you put x = 3, then both top and bottom of the fraction are 0.
This means we can use l’Hôpital’s Rule. All you have to do is differentiate the
top and bottom separately; don’t use the quotient rule! The solution looks
like this:

\[ \lim_{x \to 3} \frac{x^2 - 9}{x - 3} \overset{\text{l'H}}{=} \lim_{x \to 3} \frac{2x}{1} = 6. \]

Notice how there’s a little “l’H” above the equal sign to show that we’re using
l’Hôpital’s Rule. By the way, you don’t need to use l’Hôpital’s Rule here; you
can just factor x^2 - 9 as (x - 3)(x + 3), like this:

\[ \lim_{x \to 3} \frac{x^2 - 9}{x - 3} = \lim_{x \to 3} \frac{(x - 3)(x + 3)}{x - 3} = \lim_{x \to 3} (x + 3) = 3 + 3 = 6. \]

Hey, we got the same answer! That’s a relief.

Here’s a harder example where the factoring trick doesn’t work:

\[ \lim_{x \to 0} \frac{x - \sin(x)}{x^3}. \]

If you put x = 0, then both top and bottom are 0. The principle that
sin(x) behaves like x for small x is useless in this case, since we’re taking the
difference of the two quantities. So let’s apply l’Hôpital’s Rule, differentiating
x - sin(x) and x^3 separately:

\[ \lim_{x \to 0} \frac{x - \sin(x)}{x^3} \overset{\text{l'H}}{=} \lim_{x \to 0} \frac{1 - \cos(x)}{3x^2}. \]

We actually saw how to solve the right-hand limit (without the 3 on the
bottom) in Section 7.1.2 of Chapter 7. There we used the trick of multiplying
top and bottom by 1 + cos(x). There’s an easier way: just notice that the
right-hand limit is also of the form 0/0 when you replace x by 0 (since
cos(0) = 1), so we can use l’Hôpital’s Rule again! We get

\[ \lim_{x \to 0} \frac{x - \sin(x)}{x^3} \overset{\text{l'H}}{=} \lim_{x \to 0} \frac{1 - \cos(x)}{3x^2} \overset{\text{l'H}}{=} \lim_{x \to 0} \frac{\sin(x)}{6x}. \]

We could actually use l’Hôpital’s Rule once more to find the final limit, but
a better way is to write

\[ \lim_{x \to 0} \frac{\sin(x)}{6x} = \frac{1}{6} \lim_{x \to 0} \frac{\sin(x)}{x} = \frac{1}{6} \times 1 = \frac{1}{6}. \]

(Here we used our classic trig limit which we proved in Section 7.1.5 of
Chapter 7.) All in all, we have proved that

\[ \lim_{x \to 0} \frac{x - \sin(x)}{x^3} = \frac{1}{6}. \]


Before we move on to the next variation, here’s a nice observation. Way
back in Section 6.5 of Chapter 6, we saw that some limits can be thought of
as derivatives in disguise. For example, we worked out

\[ \lim_{h \to 0} \frac{\sqrt[5]{32 + h} - 2}{h} \]

by the tricky technique of setting f(x) = x^(1/5), then finding f'(x), writing
it as a limit, and finally putting x = 32. (Check back to see the details.) The
point is that l’Hôpital’s Rule actually makes all these shenanigans unnecessary!
For example, since the above limit is of the indeterminate form 0/0, we can find
it by differentiating the top and bottom with respect to h. First write the
fifth root of 32 + h as (32 + h)^(1/5); then we have

\[ \lim_{h \to 0} \frac{\sqrt[5]{32 + h} - 2}{h} = \lim_{h \to 0} \frac{(32 + h)^{1/5} - 2}{h} \overset{\text{l'H}}{=} \lim_{h \to 0} \frac{\frac{1}{5}(32 + h)^{-4/5}}{1} = \frac{1}{5}(32)^{-4/5}, \]

which works out to be 1/80. This agrees with the answer we found previously.
Now you should go back and look at the other examples in Section 6.5 of
Chapter 6 and try using l’Hôpital’s Rule on them instead.
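
A quick numerical sanity check never hurts. The Python sketch below plugs
small numbers into the two 0/0 limits from this section; the function names
are invented for the illustration, and of course a table of values only
suggests a limit, it doesn’t prove one.

```python
import math

def ratio(x):
    """The original quotient (x - sin x) / x^3."""
    return (x - math.sin(x)) / x**3

def after_one_rule(x):
    """Quotient of first derivatives: (1 - cos x) / (3 x^2)."""
    return (1 - math.cos(x)) / (3 * x**2)

def after_two_rules(x):
    """Quotient of second derivatives: sin x / (6 x)."""
    return math.sin(x) / (6 * x)

def fifth_root_quotient(h):
    """The derivative-in-disguise quotient ((32 + h)^(1/5) - 2) / h."""
    return ((32 + h) ** 0.2 - 2) / h

for x in (0.5, 0.1, 0.01):
    print(x, ratio(x), after_one_rule(x), after_two_rules(x))
print(fifth_root_quotient(1e-6), 1 / 80)
```

All three columns should drift toward 1/6 as x shrinks, and the last line
should show two numbers that agree to several decimal places.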


14.1.2 Type A: ±∞/±∞ case

L’Hôpital’s Rule also works in the case where lim f(x) = ∞ and lim g(x) = ∞
as x → a. That is, when you try to put x = a, the top and bottom both look
infinite, so you are dealing with the indeterminate form ∞/∞. For example,
to find

\[ \lim_{x \to \infty} \frac{3x^2 + 7x}{2x^2 - 5}, \]

you could note that both top and bottom go to ∞ as x → ∞, then use
l’Hôpital’s Rule:

\[ \lim_{x \to \infty} \frac{3x^2 + 7x}{2x^2 - 5} \overset{\text{l'H}}{=} \lim_{x \to \infty} \frac{6x + 7}{4x} = \lim_{x \to \infty} \left( \frac{6}{4} + \frac{7}{4x} \right). \]

The term 7/(4x) goes to 0 as x → ∞, so the limit is 6/4, which is just 3/2. Of
course, you could just have used the methods of Section 4.3 of Chapter 4 to
find the limit; try checking that you still get 3/2 using those methods.

Here’s another example. To find

\[ \lim_{x \to 0^+} \frac{\csc(x)}{1 - \ln(x)}, \]

notice that as x → 0+, both the numerator and the denominator tend to
∞. Why? Well, sin(x) goes to 0 as x → 0+, so csc(x) blows up; and also
ln(x) → -∞ as x → 0+, so 1 - ln(x) → ∞. Now use l’Hôpital’s Rule:

\[ \lim_{x \to 0^+} \frac{\csc(x)}{1 - \ln(x)} \overset{\text{l'H}}{=} \lim_{x \to 0^+} \frac{-\csc(x)\cot(x)}{-1/x} = \lim_{x \to 0^+} \frac{x}{\sin(x)\tan(x)}. \]

To find the limit, write it as

\[ \lim_{x \to 0^+} \frac{x}{\sin(x)} \times \frac{1}{\tan(x)}. \]

We have

\[ \lim_{x \to 0^+} \frac{x}{\sin(x)} = 1, \]

but for the other factor we have

\[ \lim_{x \to 0^+} \frac{1}{\tan(x)} = \infty, \]

since tan(x) → 0+ as x → 0+. So we have proved that

\[ \lim_{x \to 0^+} \frac{\csc(x)}{1 - \ln(x)} = \infty. \]


The rule also applies as x → ∞, as we saw above. Here’s another example:

\[ \lim_{x \to \infty} \frac{x}{e^x} \overset{\text{l'H}}{=} \lim_{x \to \infty} \frac{1}{e^x} = 0. \]

The last limit is 0 because e^x → ∞ as x → ∞. Also, the justification for
using l’Hôpital’s Rule is that both x and e^x go to ∞ as x → ∞. Notice that
the denominator e^x was unscathed by the differentiation, but the numerator
x was knocked down to 1. This is even clearer when you consider the example

\[ \lim_{x \to \infty} \frac{x^3}{e^x}. \]

Just use l’Hôpital’s Rule three times, noting that in each case we are dealing
with the indeterminate form ∞/∞:

\[ \lim_{x \to \infty} \frac{x^3}{e^x} \overset{\text{l'H}}{=} \lim_{x \to \infty} \frac{3x^2}{e^x} \overset{\text{l'H}}{=} \lim_{x \to \infty} \frac{6x}{e^x} \overset{\text{l'H}}{=} \lim_{x \to \infty} \frac{6}{e^x} = 0. \]

Of course, the same technique applies to any power of x; you just have to
apply the rule enough times, knocking the power down by 1 each time, while
the e^x just sits there like some immovable lump. So we have proved the
principle that exponentials grow quickly, which is discussed in some detail in
Section 9.4.4 of Chapter 9.
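
The “knocking the power down” process is mechanical enough to sketch in a
few lines of Python. The function name knock_down is made up for this
illustration; it only tracks the numerator c·x^p, since the denominator e^x
never changes.

```python
import math

def knock_down(n):
    """Apply the rule repeatedly to x**n / e**x.

    Returns the list of numerators as (coefficient, power) pairs,
    starting with (1, n) and ending when the power reaches 0.
    """
    c, p = 1, n
    steps = [(c, p)]
    while p > 0:
        c, p = c * p, p - 1   # derivative of c*x**p is (c*p)*x**(p-1)
        steps.append((c, p))
    return steps

steps = knock_down(3)
print(steps)                      # numerators x^3, 3x^2, 6x, 6
print(50**3 / math.exp(50))       # the quotient itself is already tiny at x = 50
```

For any power n the last numerator is the constant n!, and a constant over
e^x certainly goes to 0.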

Now, a gentle reminder: please, please, please check that you have an
indeterminate form! The only acceptable forms for a quotient are 0/0 or
±∞/±∞. For example, if you try to use l’Hôpital’s Rule on the limit

\[ \lim_{x \to 0} \frac{x^2}{\cos(x)}, \]

you’ll get into a real tangle. Let’s see what happens:

\[ \lim_{x \to 0} \frac{x^2}{\cos(x)} \overset{\text{l'H?}}{=} \lim_{x \to 0} \frac{2x}{-\sin(x)} = -2. \]

This is clearly wrong, since x^2 and cos(x) are both positive when x is near 0.
In fact, the correct solution is

\[ \lim_{x \to 0} \frac{x^2}{\cos(x)} = \frac{0^2}{\cos(0)} = \frac{0}{1} = 0. \]

L’Hôpital’s Rule can’t be used here since the form is 0/1, which is not
indeterminate. So be careful!

14.1.3 Type B1 (∞ - ∞)

Here’s a limit from the beginning of this chapter:

\[ \lim_{x \to 0} \left( \frac{1}{\sin(x)} - \frac{1}{x} \right). \]

As x → 0+, both 1/sin(x) and 1/x go to ∞. As x → 0-, both quantities go
to -∞. Either way, you’re looking at the difference of two huge (positive or
negative) quantities, so we can express the indeterminate form as ±(∞ - ∞).

Luckily, it’s pretty easy to reduce this to Type A. Just take a common
denominator:

\[ \lim_{x \to 0} \left( \frac{1}{\sin(x)} - \frac{1}{x} \right) = \lim_{x \to 0} \frac{x - \sin(x)}{x \sin(x)}. \]

Now you can put x = 0 and see that we are in the 0/0 case. So we can apply
l’Hôpital’s Rule:

\[ \lim_{x \to 0} \frac{x - \sin(x)}{x \sin(x)} \overset{\text{l'H}}{=} \lim_{x \to 0} \frac{1 - \cos(x)}{\sin(x) + x\cos(x)}. \]

Notice that we used the product rule to differentiate the denominator. In any
case, we are again in 0/0 territory: just put x = 0 and see that the top and
bottom both become 0. So we use l’Hôpital’s Rule (and the product rule)
once more:

\[ \lim_{x \to 0} \frac{1 - \cos(x)}{\sin(x) + x\cos(x)} \overset{\text{l'H}}{=} \lim_{x \to 0} \frac{\sin(x)}{\cos(x) + \cos(x) - x\sin(x)}. \]

Don’t use l’Hôpital’s Rule again! At this stage, just put x = 0; the numerator
is 0 and the denominator is 2, so the overall limit is 0. Putting everything
together, we have shown that

\[ \lim_{x \to 0} \left( \frac{1}{\sin(x)} - \frac{1}{x} \right) = 0. \]
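
Here is a small Python sketch that evaluates this Type B1 expression near
0, both as written and after taking the common denominator. The function
names difference and combined are just labels for this illustration.

```python
import math

def difference(x):
    """The Type B1 expression 1/sin(x) - 1/x."""
    return 1 / math.sin(x) - 1 / x

def combined(x):
    """The same quantity over a common denominator: (x - sin x)/(x sin x)."""
    return (x - math.sin(x)) / (x * math.sin(x))

for x in (0.1, 0.01, 0.001, -0.001):
    print(x, difference(x), combined(x))
```

Both columns should agree and shrink toward 0 from either side, even though
each of 1/sin(x) and 1/x on its own is enormous.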


Taking a common denominator doesn’t always work. Sometimes you might
not even have a denominator at all, so you have to create one out of thin air.
For example, to find

\[ \lim_{x \to \infty} \left( \sqrt{x + \ln(x)} - \sqrt{x} \right), \]

first note that as x → ∞, both √(x + ln(x)) and √x go to ∞; so we are in the
∞ - ∞ case. There’s no denominator, so let’s make one by multiplying and
dividing by the conjugate expression:

\[ \lim_{x \to \infty} \left( \sqrt{x + \ln(x)} - \sqrt{x} \right) = \lim_{x \to \infty} \left( \sqrt{x + \ln(x)} - \sqrt{x} \right) \times \frac{\sqrt{x + \ln(x)} + \sqrt{x}}{\sqrt{x + \ln(x)} + \sqrt{x}}. \]

Using the difference of squares formula (a - b)(a + b) = a^2 - b^2, this becomes

\[ \lim_{x \to \infty} \frac{x + \ln(x) - x}{\sqrt{x + \ln(x)} + \sqrt{x}} = \lim_{x \to \infty} \frac{\ln(x)}{\sqrt{x + \ln(x)} + \sqrt{x}}. \]

Now we are in the ∞/∞ case of Type A, so we just differentiate top and
bottom (using the chain rule on the bottom) to see that

\[ \lim_{x \to \infty} \frac{\ln(x)}{\sqrt{x + \ln(x)} + \sqrt{x}} \overset{\text{l'H}}{=} \lim_{x \to \infty} \frac{1/x}{\dfrac{1 + 1/x}{2\sqrt{x + \ln(x)}} + \dfrac{1}{2\sqrt{x}}}. \]

If you multiply the top and bottom of the fraction by x, you get

\[ \lim_{x \to \infty} \frac{1}{\dfrac{x + 1}{2\sqrt{x + \ln(x)}} + \dfrac{\sqrt{x}}{2}}. \]

We’re almost done, but we do need to take a little look at what happens to
the first fraction in the denominator as x → ∞:

\[ \lim_{x \to \infty} \frac{x + 1}{2\sqrt{x + \ln(x)}}. \]

This is also an ∞/∞ indeterminate form, so whack out another application
of ye olde l’Hôpital’s Rule:

\[ \lim_{x \to \infty} \frac{x + 1}{2\sqrt{x + \ln(x)}} \overset{\text{l'H}}{=} \lim_{x \to \infty} \frac{1}{\dfrac{2(1 + 1/x)}{2\sqrt{x + \ln(x)}}} = \lim_{x \to \infty} \frac{\sqrt{x + \ln(x)}}{1 + 1/x}. \]

As x → ∞, the denominator 1 + 1/x goes to 1 but the numerator √(x + ln(x))
goes to ∞. This means that

\[ \lim_{x \to \infty} \frac{x + 1}{2\sqrt{x + \ln(x)}} = \infty. \]

Returning to our original problem, we have already found that

\[ \lim_{x \to \infty} \left( \sqrt{x + \ln(x)} - \sqrt{x} \right) = \lim_{x \to \infty} \frac{1}{\dfrac{x + 1}{2\sqrt{x + \ln(x)}} + \dfrac{\sqrt{x}}{2}}. \]

Both fractions in the denominator go to ∞ as x → ∞, so the limit is 0.
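
The conjugate trick is also how you’d want to compute this quantity on a
machine. In the Python sketch below (function names invented for the
illustration), direct subtracts two large nearly equal numbers, while
conjugate_form uses the rearranged expression ln(x)/(√(x + ln(x)) + √x).

```python
import math

def direct(x):
    """sqrt(x + ln x) - sqrt(x), computed exactly as written."""
    return math.sqrt(x + math.log(x)) - math.sqrt(x)

def conjugate_form(x):
    """The same quantity after multiplying and dividing by the conjugate."""
    return math.log(x) / (math.sqrt(x + math.log(x)) + math.sqrt(x))

for x in (1e2, 1e4, 1e8, 1e12):
    print(x, direct(x), conjugate_form(x))
```

The values should creep down toward 0 as x grows, though quite slowly,
which fits with the limit being 0.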

Unfortunately, it’s not always possible to use l’Hôpital’s Rule on Type B1
limits. In fact, the only time it can actually work is when you’re able to
manipulate the original expression to be a ratio of two quantities, as in the
above examples.


14.1.4 Type B2 (0 × ±∞)

Here’s a limit we’ve looked at before, in Section 9.4.6 of Chapter 9 as well as
at the beginning of this chapter:

\[ \lim_{x \to 0^+} x \ln(x). \]

The limit has to be as x → 0+ since ln(x) isn’t even defined when x ≤ 0.
Now, as x → 0+, we see that x → 0 while ln(x) → -∞, so we are dealing
with the indeterminate form 0 × (-∞). Let’s turn the limit into Type A by
manufacturing a denominator. The idea is to move x into a new denominator
by putting it there as 1/x:

\[ \lim_{x \to 0^+} x \ln(x) = \lim_{x \to 0^+} \frac{\ln(x)}{1/x}. \]

Now the form is -∞/∞, so we can use l’Hôpital’s Rule:

\[ \lim_{x \to 0^+} \frac{\ln(x)}{1/x} \overset{\text{l'H}}{=} \lim_{x \to 0^+} \frac{1/x}{-1/x^2}. \]

We can simplify the fraction on the right to -x, so that the overall limit is

\[ \lim_{x \to 0^+} (-x) = 0. \]

We’ve solved the problem, but let’s just check out something: why did we
move x into the denominator and not ln(x)? It’s true that

\[ \lim_{x \to 0^+} x \ln(x) = \lim_{x \to 0^+} \frac{x}{1/\ln(x)}. \]

Now you have to differentiate 1/ln(x) instead, which is much harder. If you
try it, you’ll see that

\[ \lim_{x \to 0^+} \frac{x}{1/\ln(x)} \overset{\text{l'H}}{=} \lim_{x \to 0^+} \frac{1}{(1/x)\left(-1/(\ln(x))^2\right)} = \lim_{x \to 0^+} -x(\ln(x))^2. \]

This is actually worse than the original limit! So, take care when you choose
which factor to move down the bottom. As you can see from the above
example, moving a log term can be a bad idea, so avoid doing that.
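
Numerically, both x ln(x) and the uglier expression -x(ln(x))^2 do head to
0 as x → 0+, which you can see with the Python sketch below (the function
names are invented here). The point of the warning above is not that the
second form has a different limit, only that it’s harder to deal with by
hand.

```python
import math

def product(x):
    """The Type B2 expression x * ln(x)."""
    return x * math.log(x)

def bad_rearrangement(x):
    """What you get after moving ln(x) to the bottom: -x * (ln x)^2."""
    return -x * math.log(x) ** 2

for x in (1e-1, 1e-3, 1e-6, 1e-9):
    print(x, product(x), bad_rearrangement(x))
```

Notice that the second column shrinks more slowly than the first, since it
carries an extra factor of ln(x).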

Here’s another example:

\[ \lim_{x \to \pi/2} \left( x - \frac{\pi}{2} \right) \tan(x). \]

When you put x = π/2, the first factor (x - π/2) is 0, while the tan(x) factor
is either ∞ (as x → (π/2)-) or -∞ (as x → (π/2)+). Sketch the graph of
y = tan(x) to make sure you believe this. In any case, we can move the tan(x)
factor down to a new denominator by putting it there as 1/tan(x), or cot(x).
That is,

\[ \lim_{x \to \pi/2} \left( x - \frac{\pi}{2} \right) \tan(x) = \lim_{x \to \pi/2} \frac{x - \pi/2}{\cot(x)}. \]

This is a lot easier than putting the (x - π/2) term in the denominator; in
fact, that doesn’t even work. Anyway, the above limit is now in 0/0 form, so
you can use l’Hôpital’s Rule:

\[ \lim_{x \to \pi/2} \left( x - \frac{\pi}{2} \right) \tan(x) = \lim_{x \to \pi/2} \frac{x - \pi/2}{\cot(x)} \overset{\text{l'H}}{=} \lim_{x \to \pi/2} \frac{1}{-\csc^2(x)}. \]

Since sin(π/2) = 1, we see that also csc(π/2) = 1, so the above limit is -1.

14.1.5 Type C (1^(±∞), 0^0, or ∞^0)



Finally, the trickiest type involves limits like

\[ \lim_{x \to 0^+} x^{\sin(x)}, \]

where both the base and exponent involve the dummy variable (x in this case).
If you just put x = 0, you get 0^0, which is another indeterminate form. To find
the limit, we’ll use a technique very similar to logarithmic differentiation (see
Section 9.5 in Chapter 9). The idea is to take the logarithm of the quantity
first, and work out its limit as x → 0+:

\[ \lim_{x \to 0^+} \ln\left( x^{\sin(x)} \right). \]

By our log rules (see Section 9.1.4 of Chapter 9), the exponent sin(x) comes
down out front of the logarithm:

\[ \lim_{x \to 0^+} \ln\left( x^{\sin(x)} \right) = \lim_{x \to 0^+} \sin(x) \ln(x). \]

As x → 0+, we have sin(x) → 0 and ln(x) → -∞, so now we’re dealing
with a Type B2 problem. We can put the sin(x) into a new denominator
as 1/sin(x), which is just csc(x), then use l’Hôpital’s Rule on the resulting
Type A problem:

\[ \lim_{x \to 0^+} \sin(x) \ln(x) = \lim_{x \to 0^+} \frac{\ln(x)}{\csc(x)} \overset{\text{l'H}}{=} \lim_{x \to 0^+} \frac{1/x}{-\csc(x)\cot(x)}. \]

This can be rearranged to

\[ \lim_{x \to 0^+} -\frac{\sin(x)}{x} \times \tan(x) = -1 \times 0 = 0. \]

Are we done? Not quite. We now know that

\[ \lim_{x \to 0^+} \ln\left( x^{\sin(x)} \right) = 0; \]

so now we just have to exponentiate both sides to see that

\[ \lim_{x \to 0^+} x^{\sin(x)} = e^0 = 1. \]

(The exponentiation works because e^x is a continuous function of x.)

Let’s review what we just did. Instead of finding the original limit, we
took logarithms and then found that limit, using the Type B2 technique.
Finally, we exponentiated at the end.

In fact, sometimes you don’t even have to go through the Type B2 step
on your way to Type A. For example, to do

\[ \lim_{x \to 0} (1 + 3\tan(x))^{1/x} \]

from the beginning of the chapter, first note that we are dealing with the form
1^(±∞). So take logarithms:

\[ \lim_{x \to 0} \ln\left( (1 + 3\tan(x))^{1/x} \right) = \lim_{x \to 0} \frac{1}{x} \ln(1 + 3\tan(x)) = \lim_{x \to 0} \frac{\ln(1 + 3\tan(x))}{x}. \]

This is now of the form 0/0, so it’s already a Type A limit. By the chain rule,
we have

\[ \lim_{x \to 0} \frac{\ln(1 + 3\tan(x))}{x} \overset{\text{l'H}}{=} \lim_{x \to 0} \frac{\dfrac{3\sec^2(x)}{1 + 3\tan(x)}}{1} = \frac{3(1)^2}{1 + 3(0)} = 3. \]

We have now shown that

\[ \lim_{x \to 0} \ln\left( (1 + 3\tan(x))^{1/x} \right) = 3. \]

Exponentiate both sides to get

\[ \lim_{x \to 0} (1 + 3\tan(x))^{1/x} = e^3. \]

There is one more indeterminate form of this type, ∞^0. An example is

\[ \lim_{x \to \infty} x^{-1/x}, \]

since -1/x → 0 as x → ∞. The same trick still works: take logarithms and
use the Type A methodology to get

\[ \lim_{x \to \infty} \ln\left( x^{-1/x} \right) = \lim_{x \to \infty} \frac{\ln(x)}{-x} \overset{\text{l'H}}{=} \lim_{x \to \infty} \frac{1/x}{-1} = 0. \]

Now exponentiate to get

\[ \lim_{x \to \infty} x^{-1/x} = e^0 = 1. \]

It’s not really necessary to learn that the only indeterminate forms involving
exponentials are 1^(±∞), 0^0, and ∞^0. You see, if you have any limit involving
exponentials, you can always use the above logarithmic method to convert
everything to a product or quotient, then work out the new limit L. The
actual limit will just be e^L. The only exceptions are that if L = ∞, then you
have to interpret e^∞ as ∞; and if L = -∞, then you need to recognize e^(-∞)
as 0. This is consistent with our limits

\[ \lim_{x \to \infty} e^x = \infty \quad \text{and} \quad \lim_{x \to -\infty} e^x = 0 \]

from Section 9.4.4 of Chapter 9.
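
The “take logs, find the limit L, then exponentiate” recipe can be mimicked
numerically. In the Python sketch below, the helper via_log and the three
lambda names are made up for this illustration; each lambda is the logarithm
of one of the Type C expressions above, already written as g(x) ln(f(x)).

```python
import math

def via_log(log_of_expression, x):
    """Evaluate f(x)**g(x) as exp(g(x) * ln(f(x))) at the point x."""
    return math.exp(log_of_expression(x))

log_x_pow_sin = lambda x: math.sin(x) * math.log(x)               # x ** sin(x)
log_one_plus_3tan = lambda x: math.log(1 + 3 * math.tan(x)) / x   # (1 + 3 tan x) ** (1/x)
log_x_pow_neg_inv = lambda x: -math.log(x) / x                    # x ** (-1/x)

print(via_log(log_x_pow_sin, 1e-8))                   # should be near 1
print(via_log(log_one_plus_3tan, 1e-6), math.exp(3))  # should be near e^3
print(via_log(log_x_pow_neg_inv, 1e8))                # should be near 1
```

As always, plugging in one small (or large) number is only a sanity check
on the answers 1, e^3 and 1; the real argument is the one above.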

14.1.6 Summary of l’Hôpital’s Rule types

Here are all the techniques we’ve looked at:

• Type A: if the limit involves a fraction, like

\[ \lim_{x \to a} \frac{f(x)}{g(x)}, \]

check that the form is indeterminate. It must be 0/0 or ±∞/±∞.
Use the rule

\[ \lim_{x \to a} \frac{f(x)}{g(x)} \overset{\text{l'H}}{=} \lim_{x \to a} \frac{f'(x)}{g'(x)}. \]

Do not use the quotient rule here! Now, solve the new limit, perhaps
even using l’Hôpital’s Rule again.

• Type B1: if the limit involves a difference, like

\[ \lim_{x \to a} (f(x) - g(x)), \]

where the form is ±(∞ - ∞), try taking a common denominator or
multiplying by a conjugate expression to reduce to a Type A form.

• Type B2: if the limit involves a product, like

\[ \lim_{x \to a} f(x)g(x), \]

where the form is 0 × ±∞, pick the simpler of the two factors and put
it on the bottom as its reciprocal. (Avoid picking a log term; keep that
on the top.) You get something like

\[ \lim_{x \to a} f(x)g(x) = \lim_{x \to a} \frac{f(x)}{1/g(x)}. \]

This is now a Type A form.

• Type C: if the limit involves an exponential where both base and
exponent involve the dummy variable, like

\[ \lim_{x \to a} f(x)^{g(x)}, \]

then first work out the limit of the logarithm:

\[ \lim_{x \to a} \ln\left( f(x)^{g(x)} \right) = \lim_{x \to a} g(x) \ln(f(x)). \]

This should be either Type B2 or Type A (or else it’s not indeterminate
and you can just substitute). Once you’ve solved it, you can rewrite the
equation as something like

\[ \lim_{x \to a} \ln\left( f(x)^{g(x)} \right) = L, \]

then exponentiate both sides to get

\[ \lim_{x \to a} f(x)^{g(x)} = e^L. \]

Now all that’s left is for you to practice doing as many l’Hôpital’s Rule
problems as you can get your hands on!

14.2 Overview of Limits 

It’s time to consolidate. Here’s a brief summary of all the techniques we’ve 
seen so far involving evaluating limits. The following techniques apply to 
limits of the form 

\[ \lim_{x \to a} F(x), \]

where F is a function which is at least continuous for x near a, but maybe
not at x = a itself. Also, a could be ∞ or −∞. So, here’s the summary:

• Try substituting first. You might be able to evaluate the limit.

• If your substitution leads to b/∞ or b/(−∞), where b is some finite
number, then the limit is 0.

• If the substitution gives b/0, where b ≠ 0, then you’re dealing with a
vertical asymptote. The left-hand and right-hand limits must be ∞
or −∞, and the two-sided limit either doesn’t exist (if the left-hand and
right-hand limits are different) or is one of ∞ and −∞. Use a table of
signs around x = a to investigate the left-hand and right-hand limits.
(Also see Section 4.1 in Chapter 4.)

• If none of the above points are relevant, and your limit is of the form
0/0, try seeing if it is a derivative in disguise. If you can rewrite it
in the form

\[ \lim_{h \to 0} \frac{f(x + h) - f(x)}{h} \]

for some particular function f and possibly a specific number x, then the
limit is just f′(x). As we saw in Section 14.1.1 above, these sorts of
problems can also be done by using l’Hôpital’s Rule. (See also Section 6.5
in Chapter 6.)

• If square roots are involved, multiplication by a conjugate expression 
might help. (See Section 4.2 in Chapter 4.) 

• If absolute values are involved, convert them into piecewise-defined
functions using the formula

\[ |A| = \begin{cases} A & \text{if } A \ge 0, \\ -A & \text{if } A < 0. \end{cases} \]

Remember to replace all five occurrences of A above with the actual
expression you’re taking the absolute value of! (See Section 4.6 in
Chapter 4.)

• Otherwise, you can use the properties of various functions which can pop
up as ingredients in your main function. Remember that “small” means
“near 0,” and “large” can mean large positive or negative numbers. (See
Section 3.4.1 in Chapter 3.) Beware: if your limit is as x → ∞, it doesn’t
necessarily mean that you are in the large case. For example, sin(1/x)
involves the sine of a small number as x → ∞, since 1/x → 0. The same
warning applies to limits as x → 0, which need not be in the small case.
Anyway, here’s the deal for polynomials, trig functions, exponentials,
and logs:

1. Polynomials and poly-type functions: 

— General tip: try factoring, then cancel common factors. (See 
Section 4.1 in Chapter 4.) 

— Large arguments: the largest-degree term dominates, so
divide and multiply by that term. (See Section 4.3 in
Chapter 4.)

2. Trig and inverse trig functions:

— General tip: know the graphs of all the trig and inverse trig 
functions, and their values at some common arguments. All the 
stuff in Chapter 2 and Chapter 10 is helpful in this regard. 

— Small arguments: sin(A) behaves like A when A is small, 
so divide and multiply by A. The same goes for tan(A), but 
not cos(A): that just behaves like 1. This technique is useful 
when only products and quotients are involved. It probably 
won’t work when the trig function is added to or subtracted 
from some other quantity. (See Section 7.1.2 in Chapter 7.) 

— Large arguments: for sine or cosine, use the facts that

\[ |\sin(\text{anything})| \le 1 \quad\text{and}\quad |\cos(\text{anything})| \le 1 \]

in conjunction with the sandwich principle. (See Section 7.1.3
in Chapter 7.) Some other useful facts are

\[ \lim_{x \to \infty} \tan^{-1}(x) = \frac{\pi}{2} \quad\text{and}\quad \lim_{x \to -\infty} \tan^{-1}(x) = -\frac{\pi}{2}. \]

(Informally, you can think of these as $\tan^{-1}(\infty) = \pi/2$ and
$\tan^{-1}(-\infty) = -\pi/2$, but make sure you understand that these
are just crude ways of expressing the limits above.)

3. Exponentials: 

— General tip: know the graph of $y = e^x$, and learn the limits

\[ \lim_{h \to 0} (1 + hx)^{1/h} = e^x \quad\text{and}\quad \lim_{n \to \infty} \left(1 + \frac{x}{n}\right)^{n} = e^x. \]

(See Section 9.4.1 in Chapter 9.) 

— Small arguments: since $e^0 = 1$, you can normally just isolate
any factors which involve the exponential of a small number
and replace them by 1 when you take the limit. The exception
is when sums or differences occur; then you might want to use
l’Hôpital’s Rule, or perhaps the limit is actually a derivative in
disguise. (See Section 9.4.2 in Chapter 9.)

— Large arguments: learn the important limits

\[ \lim_{x \to \infty} e^x = \infty \quad\text{and}\quad \lim_{x \to -\infty} e^x = 0. \]

(For substitution purposes only, you can think of these limits as
$e^{\infty} = \infty$ and $e^{-\infty} = 0$, even though these equations aren’t
formally true.) Also remember that exponentials grow quickly
as x → ∞. This means that

\[ \lim_{x \to \infty} \frac{\text{poly}}{e^x} = 0. \]

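The base e could instead be any number bigger than 1, and
the exponent x could instead be some other polynomial with
positive leading coefficient. (See Section 9.4.4 in Chapter 9.)

4. Logarithms:

— Small arguments: learn the important limit

\[ \lim_{x \to 0^+} \ln(x) = -\infty \]

(or, as a memory aid only, ln(0) = −∞). Also, logs “grow”
slowly as x → 0⁺, in the sense that

\[ \lim_{x \to 0^+} x^{a} \ln(x) = 0 \]

for any a > 0, no matter how small. (See Section 9.4.6 in
Chapter 9.)

— Large arguments: learn the important limit

\[ \lim_{x \to \infty} \ln(x) = \infty, \]

which has the informal abbreviation ln(∞) = ∞. Nevertheless,
logs grow slowly:

\[ \lim_{x \to \infty} \frac{\ln(x)}{\text{poly}} = 0 \]

for any polynomial of positive degree. (See Section 9.4.5 in
Chapter 9.)
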
• If none of the above techniques work, consider using l’Hôpital’s Rule (see
Section 14.1.6 above for a summary). If you do, you’ll always get a new
limit to solve, which you can attack using any of the above principles or
l’Hôpital’s Rule once again.

All these facts and methods above are just tools to help you solve limits. 
They may not work on every limit you see — in fact, we’ll be looking at a 
completely different type of limit problem in Chapter 17 — but they should 
help with a heck of a lot of them. There’s an art to knowing which tool to 
use, and of course, practice makes perfect. So go forth and evaluate limits! 






CHAPTER 15

Introduction to Integration


So far as calculus is concerned, differentiation is only half the story. The 
other half concerns integration. This powerful tool enables us to find areas of 
curved regions, volumes of solids, and distances traveled by objects moving at 
variable speeds. In this chapter, we’ll spend some time developing the theory 
we need to define the definite integral. Then, in the next chapter, we’ll give 
the definition and see how to apply it. So here’s the plan for the preliminaries 
on integration: 

• sigma notation and telescoping sums; 

• the relationship between displacement and area; and 

• using partitions to find areas. 


15.1 Sigma Notation


Consider the sum

\[ \frac{1}{1} + \frac{1}{4} + \frac{1}{9} + \frac{1}{16} + \frac{1}{25} + \frac{1}{36}. \]

This is not just a sum of random numbers: there’s a definite pattern. The
terms in the sum are reciprocals of the squares from $1^2$ through $6^2$. Here’s a
more convenient way to write the sum:

\[ \sum_{j=1}^{6} \frac{1}{j^2}. \]

To read it out loud, say “the sum, from j = 1 to 6, of $1/j^2$.” Now, here’s
how it actually works. The idea is that you plug j = 1, j = 2, j = 3, j = 4,
j = 5, and finally j = 6 into the expression $1/j^2$, one at a time, and then add
everything up. We can tell that we’re supposed to start at j = 1 and end up
at j = 6 by the symbols below and above the big Greek letter Σ (which is a
capital sigma, hence the term “sigma notation”). So we have

\[ \sum_{j=1}^{6} \frac{1}{j^2} = \frac{1}{1^2} + \frac{1}{2^2} + \frac{1}{3^2} + \frac{1}{4^2} + \frac{1}{5^2} + \frac{1}{6^2}. \]

Notice that we haven’t actually worked out the value of the sum! All we’ve
done is abbreviate it.
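Sigma notation translates almost word for word into a loop. Here's a minimal sketch in Python; the helper name `sigma` is made up for this illustration, and exact fractions are used so that no rounding sneaks in. The one thing to watch is that Python's `range` leaves out its upper end, so the code asks for `stop + 1`.

```python
from fractions import Fraction

def sigma(term, start, stop):
    """Add up term(j) for j = start, start + 1, ..., stop (both ends included)."""
    return sum(term(j) for j in range(start, stop + 1))

# The sum from j = 1 to 6 of 1/j^2, kept as an exact fraction.
total = sigma(lambda j: Fraction(1, j * j), 1, 6)
print(total, float(total))
```

Changing the last two arguments changes where the sum starts and stops, which is exactly what the symbols below and above the Σ do.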

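Now consider the following series (that’s another word for “sum”) in sigma
notation:

\[ \sum_{j=1}^{1000} \frac{1}{j^2}. \]

The only difference between this sum and the previous one is that now we
have to go to 1000, not 6. So

\[ \sum_{j=1}^{1000} \frac{1}{j^2} = \frac{1}{1^2} + \frac{1}{2^2} + \frac{1}{3^2} + \cdots + \frac{1}{999^2} + \frac{1}{1000^2}. \]

In this case, the sigma notation is particularly nice, avoiding the “⋯”
altogether (unlike the right-hand side of the above equation). Here’s another
variation:

\[ \sum_{j=5}^{1000} \frac{1}{j^2} = \frac{1}{5^2} + \frac{1}{6^2} + \frac{1}{7^2} + \cdots + \frac{1}{999^2} + \frac{1}{1000^2}. \]

This sum starts at j = 5, not j = 1, so the first term is $1/5^2$.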

Sigma notation is also really useful when you want to vary where the sum
stops (or starts). For example, consider the series

\[ \sum_{j=1}^{n} \frac{1}{j^2}. \]

This starts at j = 1 and finishes at j = n, so we have

\[ \sum_{j=1}^{n} \frac{1}{j^2} = \frac{1}{1^2} + \frac{1}{2^2} + \frac{1}{3^2} + \cdots + \frac{1}{(n-2)^2} + \frac{1}{(n-1)^2} + \frac{1}{n^2}. \]

Notice that the second-to-last term occurs when j = n − 1, and the
third-to-last term occurs when j = n − 2; I wrote those terms, along with the
first three and the last term, on the right-hand side of the above equation. The
other terms are all absorbed into the “⋯” in the middle.

In the sum

\[ \sum_{j=1}^{n} \frac{1}{j^2}, \]

it looks as if there are two variables, j and n, but in reality there is only one:
it’s n. You can easily see this by looking at the expanded form

\[ \frac{1}{1^2} + \frac{1}{2^2} + \frac{1}{3^2} + \cdots + \frac{1}{(n-2)^2} + \frac{1}{(n-1)^2} + \frac{1}{n^2}. \]


There’s no j at all! In fact, j is a dummy variable: it’s just a temporary
placeholder, called the index of summation, that runs through the integers
from 1 to n. So we could even change it to another letter without affecting
anything. For example, the following sums are all the same:

\[ \sum_{j=1}^{n} \frac{1}{j^2} = \sum_{k=1}^{n} \frac{1}{k^2} = \sum_{a=1}^{n} \frac{1}{a^2} = \sum_{\alpha=1}^{n} \frac{1}{\alpha^2}. \]

By the way, this isn’t the first time we’ve seen dummy variables: limits also use
them, so there’s nothing new here. (See the end of Section 3.1 of Chapter 3.)
Let’s look at some more examples. What is

\[ \sum_{m=1}^{200} 5\,? \]

Don’t fall into the trap of saying that it’s equal to 5. Let’s look a little closer.
When m = 1, we have a term 5. When m = 2, we again have 5. The same
goes for m = 3, m = 4 and so on until m = 200. So in fact

\[ \sum_{m=1}^{200} 5 = 5 + 5 + 5 + \cdots + 5 + 5 + 5, \]

where there are 200 terms in the sum. So the value works out to be 200 × 5,
or 1000. Similarly, consider the series

\[ \sum_{q=100}^{1000} 1 = 1 + 1 + 1 + \cdots + 1 + 1 + 1. \]

How many terms of 1 are there in this sum? You might be tempted to say
that there are 1000 − 100, or 900, but actually there’s one more. The answer
is 901. In general, the number of integers between A and B, including A
and B, is B − A + 1.
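If the "one more" still feels suspicious, you can get a computer to do the counting. This is just a small sketch; the function name `count_terms` is an invented label for the formula B − A + 1.

```python
def count_terms(a, b):
    """Number of integers from a to b, counting both a and b."""
    return b - a + 1

# Brute-force count: add a 1 for every q from 100 to 1000 inclusive.
ones = sum(1 for q in range(100, 1001))

# The constant sum: a 5 for every m from 1 to 200 inclusive.
fives = sum(5 for m in range(1, 201))

print(ones, count_terms(100, 1000), fives)
```

The brute-force count and the formula agree, and the sum of fives comes out as 200 copies of 5.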

How would you write

\[ \sin(1) + \sin(3) + \sin(5) + \cdots + \sin(2997) + \sin(2999) + \sin(3001) \]

in sigma notation? You might try

\[ \sum_{j=1}^{3001} \sin(j), \]

but that’s no good: that would be

\[ \sin(1) + \sin(2) + \sin(3) + \cdots + \sin(2999) + \sin(3000) + \sin(3001). \]

We don’t want the even numbers. Here’s how you get rid of them. First,
imagine that j steps through the numbers 1, 2, 3, and so on. Then the
quantity (2j − 1) goes through all the odd numbers 1, 3, 5, and so on. So for
our second try, let’s guess

\[ \sum_{j=1}^{3001} \sin(2j - 1). \]

This is better, but there’s still a problem. When j gets to the end of its run,
it’s at 3001, but (2j − 1) is then 2(3001) − 1 = 6001. This means that

\[ \sum_{j=1}^{3001} \sin(2j - 1) = \sin(1) + \sin(3) + \sin(5) + \cdots + \sin(5997) + \sin(5999) + \sin(6001). \]

We have too many terms! How do you know when to stop? At the end, we
need sin(2j − 1) to be sin(3001), not sin(6001). So, just set 2j − 1 = 3001,
which means that j = 1501. Finally, we have

\[ \sin(1) + \sin(3) + \sin(5) + \cdots + \sin(2997) + \sin(2999) + \sin(3001) = \sum_{j=1}^{1501} \sin(2j - 1). \]

This is the correct answer. Make sure you agree with it by plugging in the
values j = 1, j = 2, j = 3, and also j = 1499, j = 1500, and j = 1501. You
should get the terms written out on the left-hand side above. On the other
hand, the sum

\[ \sum_{j=1}^{1501} \sin(2j) \]

expands as

\[ \sin(2) + \sin(4) + \sin(6) + \cdots + \sin(2998) + \sin(3000) + \sin(3002). \]

So you get the even numbers using 2j instead of (2j − 1). Of course, if you
wanted multiples of 3, you’d use 3j. The possibilities are endless!
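Getting the stopping point right is the sort of thing that's easy to check by listing the arguments instead of trusting your head. Here's a rough sketch; the list names are made up for the illustration.

```python
import math

# Arguments of the sine as j runs from 1 to 1501.
odd_args = [2 * j - 1 for j in range(1, 1502)]
even_args = [2 * j for j in range(1, 1502)]

print(odd_args[:3], odd_args[-3:])    # first three and last three odd arguments
print(even_args[:3], even_args[-3:])  # same for the even version

# The actual value of the sum of sin(2j - 1), if you want a number.
odd_sine_sum = sum(math.sin(k) for k in odd_args)
print(odd_sine_sum)
```

The first and last few entries of each list should match the terms written out in the expansions above.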


15.1.1 A nice sum


Consider the sum

\[ \sum_{j=1}^{100} j. \]

First, let’s expand the sum. When j = 1, we get 1. When j = 2, we get 2.
This continues until j = 100; then we just add up all these quantities. So

\[ \sum_{j=1}^{100} j = 1 + 2 + 3 + \cdots + 98 + 99 + 100. \]

Yup, it’s the sum of the numbers from 1 to 100. Now, how about the sum

\[ \sum_{j=0}^{99} (j + 1)\,? \]

When j = 0, we get 1; when j = 1, we get 2; and so on until j = 99, in which
case we get 100. So in fact

\[ \sum_{j=0}^{99} (j + 1) = 1 + 2 + 3 + \cdots + 98 + 99 + 100. \]


This is the same sum as before! What we’ve done is shift the index of
summation j down by 1. Now, consider this sum:

\[ \sum_{j=1}^{100} (101 - j). \]

When j = 1, we get 100; when j = 2, we get 99; and so on until j = 100, in
which case we get 1. That is, the numbers 101 − j march down from 100 to
1, so

\[ \sum_{j=1}^{100} (101 - j) = 100 + 99 + 98 + \cdots + 3 + 2 + 1. \]

This is the same sum as before, just written backward. There are many ways
of expressing any sum in sigma notation.

In fact, this last way of writing the sum isn’t just a curiosity: we can
actually use it to find the value of the sum. Suppose that we let S be the sum
1 + 2 + ⋯ + 99 + 100; then we have seen that

\[ S = \sum_{j=1}^{100} j \quad\text{and also}\quad S = \sum_{j=1}^{100} (101 - j). \]

If you add up these two expressions, you get

\[ 2S = \sum_{j=1}^{100} j + \sum_{j=1}^{100} (101 - j). \]

In the first sum, the numbers increase from 1 to 100; in the second sum they
decrease from 100 to 1. The nice thing is that you can add the numbers in
any order and still get the same result. So we can combine the sums and write

\[ 2S = \sum_{j=1}^{100} \bigl(j + (101 - j)\bigr). \]

Since j + (101 − j) = 101, this just works out to be

\[ 2S = \sum_{j=1}^{100} 101. \]

There are 100 copies of the number 101, so we have 2S = 101 × 100 = 10100.
This means that S = 10100/2 = 5050. We have shown that the sum of the
numbers from 1 to 100 is 5050. Believe it or not, the great mathematician
Gauss worked this out (using the same method) at the age of 10!
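The pairing trick is easy to mimic in code: write the numbers forward, write them backward, add them in pairs, and halve the total. The sketch below does exactly that; `pairing_sum` is a made-up name, and nothing here is special to 100.

```python
def pairing_sum(n):
    """Sum 1 + 2 + ... + n by pairing the forward and backward lists."""
    forward = list(range(1, n + 1))            # 1, 2, ..., n
    backward = [n + 1 - j for j in forward]    # n, n - 1, ..., 1
    pair_totals = [a + b for a, b in zip(forward, backward)]
    # Every pair adds up to n + 1, and the pairs together count the sum twice.
    assert all(total == n + 1 for total in pair_totals)
    return sum(pair_totals) // 2

print(pairing_sum(100))
print(sum(range(1, 101)))  # brute-force check
```

Both lines should print the same number, since the brute-force sum and the paired-up sum are the same S.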

15.1.2 Telescoping series

Check out the following sum:

\[ \sum_{j=1}^{5} \bigl(j^2 - (j-1)^2\bigr). \]

This expands fully to

\[ (1^2 - 0^2) + (2^2 - 1^2) + (3^2 - 2^2) + (4^2 - 3^2) + (5^2 - 4^2). \]

You can cancel a lot of the terms here. In fact, if you take a close look, you’ll
see that everything cancels out except $5^2 - 0^2$, so the sum is just $5^2 = 25$. The
same sort of thing happens even if you have a lot more terms. For example,

\[ \sum_{j=1}^{200} \bigl(j^2 - (j-1)^2\bigr) \]

expands as

\[ (1^2 - 0^2) + (2^2 - 1^2) + (3^2 - 2^2) + \cdots + (198^2 - 197^2) + (199^2 - 198^2) + (200^2 - 199^2). \]

Once again, everything cancels except for $200^2 - 0^2$, so the sum is 40000. Wait
a second, there doesn’t seem to be anything to cancel out the $3^2$ or $-197^2$
terms! Well, there are $-3^2$ and $197^2$ terms hidden inside the “⋯”, so the
cancelation does work.

This sort of series is called a telescoping series. You can compact it down
to a much simpler expression, just like collapsing one of those old spyglasses.
In general, we have

\[ \sum_{j=a}^{b} \bigl(f(j) - f(j-1)\bigr) = f(b) - f(a-1). \]

For example, we have

\[ \sum_{j=10}^{100} \left(e^{\cos(j)} - e^{\cos(j-1)}\right) = e^{\cos(100)} - e^{\cos(10-1)}, \]

which is simply $e^{\cos(100)} - e^{\cos(9)}$. You just have to take the $e^{\cos(j)}$ part and
replace j by the last number (100), then subtract the $e^{\cos(j-1)}$ part with the
j replaced by the first number (10). You should try expanding the sum and
check that the cancelation works.
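Expanding ninety-one terms by hand is no fun, so here's a minimal sketch that lets the computer do the expanding. It compares the full sum of f(j) − f(j − 1) for j from a to b with the collapsed value f(b) − f(a − 1). The function names are invented for this illustration.

```python
import math

def telescoping_sum(f, a, b):
    """Add up f(j) - f(j - 1) for j = a, ..., b, one term at a time."""
    return sum(f(j) - f(j - 1) for j in range(a, b + 1))

def collapsed(f, a, b):
    """What survives the cancelation: the last f minus the f just before the start."""
    return f(b) - f(a - 1)

square = lambda j: j * j
exp_cos = lambda j: math.exp(math.cos(j))

print(telescoping_sum(square, 1, 200), collapsed(square, 1, 200))
print(telescoping_sum(exp_cos, 10, 100), collapsed(exp_cos, 10, 100))
```

For the squares the two numbers agree exactly. For the exponential example they agree up to tiny floating-point rounding, which is all you can ask of decimal approximations.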

Here’s another example. To find

\[ \sum_{j=1}^{n} \bigl(j^2 - (j-1)^2\bigr), \]

notice that the sum telescopes; so you just take $(j^2 - (j-1)^2)$ and replace
the first j by n, and the second j by 1, to see that

\[ \sum_{j=1}^{n} \bigl(j^2 - (j-1)^2\bigr) = n^2 - (1-1)^2 = n^2. \]

On the other hand, the quantity $j^2 - (j-1)^2$ works out to be $j^2 - (j^2 - 2j + 1)$,
or just 2j − 1. So we have actually shown that

\[ \sum_{j=1}^{n} (2j - 1) = n^2. \]

If you think about it, the left-hand side is just the sum of the first n odd
numbers. For example, when n = 5, the left-hand side is 1 + 3 + 5 + 7 + 9,
which works out to be 25. Hey, that’s $5^2$ exactly! If instead you take n = 6,
then the left-hand side is 1 + 3 + 5 + 7 + 9 + 11, which is 36. This is $6^2$, so
once again the formula works. We have proved that the sum of the first n odd
numbers is $n^2$.

We can say even more, though. We can split up the sum like this:

\[ \sum_{j=1}^{n} 2j - \sum_{j=1}^{n} 1 = n^2. \]

If you’re a little skeptical about this, then check out how it works for the first
five terms. Instead of writing 1 + 3 + 5 + 7 + 9, we’re expressing the sum
as (2 − 1) + (4 − 1) + (6 − 1) + (8 − 1) + (10 − 1), then rearranging to get
(2 + 4 + 6 + 8 + 10) − (1 + 1 + 1 + 1 + 1). In fact, we can take out a factor of
2 from the first sum and express it as 2(1 + 2 + 3 + 4 + 5). In terms of our
equation above, this means that we can pull out the constant 2 from the first
sum and get

\[ 2\sum_{j=1}^{n} j - \sum_{j=1}^{n} 1 = n^2. \]

Stick the second sum on the right and divide by 2 to get

\[ \sum_{j=1}^{n} j = \frac{1}{2}\left(n^2 + \sum_{j=1}^{n} 1\right). \]

The sum on the right-hand side is just n copies of 1, so it’s actually equal to
n. So the right-hand side is $(n^2 + n)/2$, which can be written as n(n + 1)/2.
We have proved the useful formula

\[ \sum_{j=1}^{n} j = \frac{n(n+1)}{2}. \]

When n = 100, this formula specializes to

\[ \sum_{j=1}^{100} j = \frac{100(100+1)}{2} = 5050, \]

agreeing with what we saw in the previous section.

Instead of starting with squares as we did in the previous example, let’s
try starting with cubes:

\[ \sum_{j=1}^{n} \bigl(j^3 - (j-1)^3\bigr) = n^3 - (1-1)^3 = n^3. \]

Once again, finding the value of the sum is easy because it’s a telescoping
series. In any case, you can do some algebra and see that $j^3 - (j-1)^3$
simplifies to $3j^2 - 3j + 1$. So the above sum becomes

\[ \sum_{j=1}^{n} (3j^2 - 3j + 1) = n^3. \]

Let’s break the sum into three pieces and pull out some constants:

\[ 3\sum_{j=1}^{n} j^2 - 3\sum_{j=1}^{n} j + \sum_{j=1}^{n} 1 = n^3. \]

Now put the last two sums on the right-hand side and divide by 3 to get

\[ \sum_{j=1}^{n} j^2 = \frac{1}{3}\left(n^3 + 3\sum_{j=1}^{n} j - \sum_{j=1}^{n} 1\right). \]

The previous example shows that the first sum on the right-hand side works
out to be n(n + 1)/2, while the second sum is again n copies of 1, which is n.
So we have

\[ \sum_{j=1}^{n} j^2 = \frac{1}{3}\left(n^3 + \frac{3n(n+1)}{2} - n\right). \]

A little algebra shows that the polynomial on the right-hand side can be
simplified to $(2n^3 + 3n^2 + n)/6$, which factors to n(n + 1)(2n + 1)/6. So we
have proved that

\[ \sum_{j=1}^{n} j^2 = \frac{n(n+1)(2n+1)}{6}. \]

Now we know how to add up the first n square numbers. For example,

\[ 1^2 + 2^2 + 3^2 + \cdots + 99^2 + 100^2 = \frac{(100)(101)(201)}{6} = 338350. \]

Even Gauss might have had to wait until he was 11 years old to find that
sum!
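It never hurts to test a freshly proved formula against brute force. The sketch below checks the three closed forms from this section (the first n odd numbers, the first n integers, and the first n squares) against straightforward addition; the function names are invented for the illustration.

```python
def sum_odds(n):
    """Closed form for 1 + 3 + 5 + ... + (2n - 1)."""
    return n * n

def sum_integers(n):
    """Closed form for 1 + 2 + ... + n."""
    return n * (n + 1) // 2

def sum_squares(n):
    """Closed form for 1^2 + 2^2 + ... + n^2."""
    return n * (n + 1) * (2 * n + 1) // 6

# Compare each closed form with plain addition for a range of n.
for n in range(1, 201):
    assert sum_odds(n) == sum(2 * j - 1 for j in range(1, n + 1))
    assert sum_integers(n) == sum(range(1, n + 1))
    assert sum_squares(n) == sum(j * j for j in range(1, n + 1))

print(sum_integers(100), sum_squares(100))
```

A check like this only covers the values of n you try, so it's a sanity test of the algebra rather than a replacement for the telescoping argument.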


15.2 Displacement and Area

Let’s move on from sigma notation, and spend some time investigating the 
following question: 

If you know the velocity of a car at every moment during some time 
interval, what is its total displacement over that time interval? 

In symbols, this means that we know the velocity v(t) at every time t in some
interval [a, b], and we want to find the displacement x(t). We already know
how to do this the other way around: if we know x(t), then v(t) is just x′(t).
That is, velocity is the derivative (with respect to time) of displacement. In
order to answer the reverse question, let’s look at some simple cases first.

15.2.1 Three simple cases 

Consider three cars going in the forward direction along a long straight
highway. Since the cars are always going forward, we can work with speed and
distance instead of velocity and displacement (respectively); there’s no
difference in this case. Each of the cars leaves from the same gas station at 3
p.m. and finishes the journey at 5 p.m.

The first car goes at a speed of 50 miles per hour the whole time. So
v(t) = 50 for all t in the interval [3, 5]. To work out the distance traveled in
this case, just use the fact that distance = average speed × time. Luckily, the
average speed $v_{av}$ and the instantaneous speed v are both equal to 50, since
the speed never changes. So we get

distance = v × t = 50 × 2 = 100.

That is, the car has gone 100 miles. Now, if we draw the graph of v against
t, it looks like this:

[Figure: graph of v against t, showing a horizontal line at v = 50 from t = 3 to t = 5 with the rectangle beneath it marked off.]

You can see a rectangle marked off between the solid line of the velocity at
v = 50, the t-axis, and the vertical lines t = 3 and t = 5. The height of the
rectangle is the speed 50 (mph), while its base is the time taken, 2 (hours).
The quantity 50 × 2 is the area of the rectangle (in miles, but let’s not get
too bogged down about units for the moment). So in this case, the distance
traveled is the area under the graph of v versus t.

As for the second car, it goes at a speed of 40 mph for the first hour; then
at 4 p.m. it starts going 60 mph. Ignoring the few seconds that it takes to
accelerate, the graph of the situation looks like this:

[Figure: graph of v against t, with v = 40 from t = 3 to t = 4 and v = 60 from t = 4 to t = 5; the area under the graph is shaded.]

I’ve already shaded the area under the graph down to the t-axis between the
lines t = 3 and t = 5, expecting this to be the distance. Let’s check it out.
During the first hour, the car travels at 40 mph, so the distance traveled is
40 × 1 = 40 miles. This is the area under the left-hand rectangle, which has
height 40 (mph) and base 1 (hour). The same thing works for the second
hour: the car travels at 60 mph, so the distance traveled is 60 × 1 = 60 miles,
which is the area of the right-hand rectangle. So the total distance is
40 + 60 = 100 miles, which is the total shaded area.

Notice that the formula distance = average speed × time is no
good on the whole journey unless you know the average speed. Wait, you
say, the average speed here is obviously 50 mph, so there’s no problem!
OK, that’s true, but let’s look at the third car and then see if you still feel
the same way.

The third car travels at 20 mph for the first 15 minutes, then goes 40 mph
until 4 p.m. At that time, it switches to 60 mph for half an hour, before
slowing to the slower speed of 50 mph for the rest of the journey. Once again
ignoring the short accelerations and decelerations when the speed changes,
the graph of v against t looks like this:


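[Figure: graph of v against t for the third car on [3, 5], showing four rectangles of heights 20, 40, 60, and 50 (mph) with areas 5, 30, 30, and 25 (miles).]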


The average speed isn’t obvious from looking at the graph. On the other
hand, we can work out the distance by breaking the 2-hour time interval into
smaller pieces corresponding to the four rectangles in the above graph:

• From 3 to 3.25 (which is the way to write 3:15 p.m. in decimal hours),
the car traveled at 20 mph, so the distance traveled was 20 × 0.25 = 5
miles. That’s the area of the first rectangle above, since its height is
20 mph and its base is 0.25 hours.

• From 3.25 to 4, the speed was 40 mph, so the distance was 40 × 0.75, or
30 miles. That’s the area of the second rectangle.

• From 4 to 4.5 (that is, 4:30 p.m.), the car’s speed was 60 mph, so the
distance was 60 × 0.5 = 30 miles, which is the area of the third rectangle.

• Finally, from 4.5 to 5, the speed was 50 mph, so the distance traveled
during that time was 50 × 0.5 = 25 miles, precisely the area of the fourth
rectangle.

So during the four time periods, the car went 5, 30, 30, and 25 miles,
respectively, as shown on the above graph; the total is therefore
5 + 30 + 30 + 25 = 90 miles. Finally, we’ve found the distance the third car
traveled! This means that its average speed was actually 90/2 = 45 mph, which
isn’t even one of the four speeds that the car went at. (This doesn’t violate
the Mean Value Theorem because the function in the above graph isn’t
differentiable.)
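The third car's bookkeeping is a natural fit for a short list of rectangles. The sketch below stores each piece of the journey as (start time, end time, speed), with times in decimal hours, and adds up height × base for each one. The variable names are made up for this illustration.

```python
# Each piece of the third car's journey: (start, end, speed in mph).
segments = [
    (3.0, 3.25, 20),
    (3.25, 4.0, 40),
    (4.0, 4.5, 60),
    (4.5, 5.0, 50),
]

# Area of each rectangle = speed (height) times elapsed time (base).
distances = [speed * (end - start) for start, end, speed in segments]
total_distance = sum(distances)

# Average speed over the whole 2-hour trip.
average_speed = total_distance / (5.0 - 3.0)

print(distances, total_distance, average_speed)
```

Swapping in a different list of segments handles the first two cars as well, so there's really only one calculation here, done with different data.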










15.2.2 A more general journey 

Let’s look at a general framework to describe the sort of journey that the three
cars made. Suppose that the time interval involved is [a, b]; also, suppose that
we can chop up this interval into smaller intervals so that the car is going at a
constant speed on each interval. We don’t want to fix the number of intervals,
so let’s call it n. We also need to have some way of describing the beginning
and end of each small interval:

• The first interval begins at time a and finishes at some later time $t_1$.
Since a is earlier than $t_1$, we can say that $a < t_1$. In fact, it will be
useful to also let $t_0 = a$, so that we have $a = t_0 < t_1$.

• The second interval begins at time $t_1$ and finishes at some later time
$t_2$, so that $t_1 < t_2$.

• The third interval goes from $t_2$ to $t_3$, where $t_2 < t_3$.

• Keep going in the same way, so that the jth time interval starts at time
$t_{j-1}$ and ends at time $t_j$.

• The second-to-last interval goes from $t_{n-2}$ to $t_{n-1}$, where $t_{n-2} < t_{n-1}$.

• Finally, the last interval goes from $t_{n-1}$ to $t_n$, which is the same as the
very end time b. So we have $t_{n-1} < t_n = b$.

All together, we can summarize the situation by saying that

\[ a = t_0 < t_1 < t_2 < t_3 < \cdots < t_{n-2} < t_{n-1} < t_n = b. \]

We have chopped up the time interval [a, b] into smaller intervals, which
together are called a partition of the interval. On the number line, it looks
something like this:

[Figure: number line marked with the points $t_0 = a$, $t_1$, $t_2$, $t_3$, $t_4$, ..., $t_{n-2}$, $t_{n-1}$, $t_n = b$.]

The dots in the middle are supposed to show that we don’t want to fix the
number of smaller intervals in the partition.

That takes care of the time aspect, but we need to talk about velocities.
Let’s suppose that the car goes at velocity $v_1$ during the first small time
interval $(t_0, t_1)$. This means that the graph of v against t will have a line
segment above $(t_0, t_1)$ at height $v_1$. As for the second interval, the velocity
will then be $v_2$, so we get a different line segment at height $v_2$ above $(t_1, t_2)$.
This keeps on going until the last time interval $(t_{n-1}, t_n)$, where the velocity
will be $v_n$. Overall, the picture looks like this (for example):

[Figure: graph of v against t made of horizontal line segments at heights $v_1$, $v_2$, $v_3$, $v_4$, ..., $v_{n-1}$, $v_n$ above the intervals of the partition from $t_0 = a$ to $t_n = b$, with the area down to the t-axis shaded.]

Now we’re ready to calculate the total displacement. During the first small
time interval $(t_0, t_1)$, the car has gone at velocity $v_1$. The length of time is
$(t_1 - t_0)$, so the displacement will be $v_1 \times (t_1 - t_0)$. Let’s quickly repeat this for
the second interval $(t_1, t_2)$. The speed is $v_2$ and the length of time is $(t_2 - t_1)$,
so the displacement is $v_2 \times (t_2 - t_1)$. Keep doing this all the way up to the
last time interval $(t_{n-1}, t_n)$. Finally, we add up all the displacements to see
that

\[ \text{total displacement} = v_1(t_1 - t_0) + v_2(t_2 - t_1) + \cdots + v_{n-1}(t_{n-1} - t_{n-2}) + v_n(t_n - t_{n-1}). \]

This is a perfect time to whip out the sigma notation that we looked at in
Section 15.1 above. Check that you believe that we can write the above
formula as

\[ \text{total displacement} = \sum_{j=1}^{n} v_j (t_j - t_{j-1}). \]

Of course, this is also the shaded area in the above graph.

Let’s see how the three examples from the previous sections fit into the
framework. In each case, we know that a = 3 and b = 5.

• For the first car, we just have one interval [3, 5], so set n = 1, $t_0 = 3$,
and $t_1 = 5$. We also know that the velocity is $v_1 = 50$; so

\[ \text{displacement} = \sum_{j=1}^{1} v_j (t_j - t_{j-1}) = v_1(t_1 - t_0) = 50(5 - 3) = 100. \]

• The second car needs two time intervals; set n = 2, $t_0 = 3$, $t_1 = 4$, and
$t_2 = 5$, so that our partition looks like 3 < 4 < 5. On the first interval,
the velocity is $v_1 = 40$, while on the second interval, we have $v_2 = 60$.
So

\[ \text{displacement} = \sum_{j=1}^{2} v_j (t_j - t_{j-1}) = v_1(t_1 - t_0) + v_2(t_2 - t_1) = 40(4 - 3) + 60(5 - 4) = 100. \]

• I’ll let you fill in the full details for the third car. Suffice it to say that
n = 4, the partition is 3 < 3.25 < 4 < 4.5 < 5, and the velocities are
$v_1 = 20$, $v_2 = 40$, $v_3 = 60$, and $v_4 = 50$, so

\[ \begin{aligned} \text{displacement} &= \sum_{j=1}^{4} v_j (t_j - t_{j-1}) \\ &= v_1(t_1 - t_0) + v_2(t_2 - t_1) + v_3(t_3 - t_2) + v_4(t_4 - t_3) \\ &= 20(3.25 - 3) + 40(4 - 3.25) + 60(4.5 - 4) + 50(5 - 4.5) \\ &= 5 + 30 + 30 + 25 = 90. \end{aligned} \]

Notice that the calculations are identical to the ones we did in the previous
section; only the notation has changed.

15.2.3 Signed area

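What if our car goes backward? For example, suppose that the car goes
forward at 40 mph between 3 and 4 p.m., then backward at 30 mph until 6
p.m. The graph looks like this:

[Figure: graph of v against t, with a rectangle of height 40 above the t-axis from t = 3 to t = 4 and a rectangle reaching down to −30 below the t-axis from t = 4 to t = 6.]
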

Now it’s really important to distinguish between distance and displacement. 
Between 3 and 4 p.m., the distance and displacement are both 40 miles. From 
4 to 6 p.m., the car travels a total of 30 x 2 = 60 miles, so the total distance 
traveled from 3 p.m. to 6 p.m. is 40 + 60 = 100 miles. On the other hand, the 
displacement is 40 + (—60) = —20 miles, since the second part of the journey 
is backward. This means that the car finishes up 20 miles back from where it 
started. 

Now look at the above graph. The rectangle on the left has area 40 (miles), 
no problem, but the right-hand rectangle is interesting. Its base has length 
2 (hours), and if you consider its height as 30 (mph), then sure enough, the 
area is 60 (miles). Adding the two areas gives 40 + 60 = 100 miles, which is 
the distance. 

On the other hand, take another look at that second rectangle. Suppose 
that we say that its “height” is actually —30 mph, since the rectangle goes 
below the horizontal axis. Of course, a rectangle can’t actually have a negative 
height, but nevertheless it would be good to distinguish between rectangles 
above and below the axis. So if the “height” is —30 mph, then the “area” is 
2 x (—30) = —60 miles. Let’s drop the quotation marks and correctly refer 
to this as the signed area. Our convention, then, is that areas below the axis 
count as negative toward the total. If we do that, then the total signed area 
is 40 miles (from the first piece) plus —60 miles (from the second), giving a 
total of —20 miles. Hey, the displacement is —20 miles! 

In terms of our formulas from the previous section, we have a partition of
the total time interval [3, 6] that looks like 3 < 4 < 6. The first velocity is
$v_1 = 40$ while the second is $v_2 = -30$. So we have

\[ \text{displacement} = \sum_{j=1}^{2} v_j (t_j - t_{j-1}) = v_1(t_1 - t_0) + v_2(t_2 - t_1) = 40(4 - 3) + (-30)(6 - 4) = -20. \]

If instead we take $v_2 = 30$, which is the speed (not the velocity!) during the
second part of the journey, then the last sum is 40(4 − 3) + 30(6 − 4) = 100,
which gives the distance in miles. Of course, the speed 30 mph is the absolute
value of the velocity −30 mph. So instead of adding up the actual (unsigned)
area in the graph above to get the distance, we could graph |v| against t:



Now it’s irrelevant whether the area is signed or not because there’s nothing 
below the horizontal axis! So, we’ll make the convention that all areas are 
signed. If we want the unsigned area, we’ll take absolute values first. See 
Section 16.4.1 in the next chapter for some more on this point. 
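Here's a minimal sketch of the signed versus unsigned distinction for the car that goes forward at 40 mph from 3 to 4 p.m. and backward at 30 mph until 6 p.m. The two function names are invented for this illustration. The only difference between them is an absolute value on the velocity.

```python
def signed_area(times, velocities):
    """Displacement: rectangles below the axis count as negative."""
    return sum(v * (t1 - t0) for v, t0, t1 in zip(velocities, times, times[1:]))

def unsigned_area(times, velocities):
    """Distance: take absolute values of the velocities first."""
    return sum(abs(v) * (t1 - t0) for v, t0, t1 in zip(velocities, times, times[1:]))

times = [3, 4, 6]        # the partition 3 < 4 < 6
velocities = [40, -30]   # forward at 40 mph, then backward at 30 mph

print(signed_area(times, velocities))    # displacement in miles
print(unsigned_area(times, velocities))  # distance in miles
```

When every velocity is positive the two functions give the same answer, which is why the distinction never came up for the three cars earlier.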

15.2.4 Continuous velocity

We’ve seen that if a car (or other object) moves along a straight line so that
the velocity is constant on a finite number of intervals in a partition of [a, b],
then the displacement is the signed area between the graph of v versus t, the
t-axis, and the lines t = a and t = b. The distance is the same thing, except
that the area is unsigned.

What if the velocity isn’t constant on a finite number of intervals? Unless
you never turn off the cruise control, you’ll be speeding up from time to time
to pass another car, slowing down when you see a cop, and so on. Even
getting from 40 to 60 mph requires some acceleration: you can’t just change
speeds instantaneously. So, let’s consider the situation where velocity v is a
continuous function of time t. Pick a little time interval [p, q] inside [a, b],
and look at what happens during that interval. Even on this little interval,
the velocity
is changing, but let’s pretend that it doesn’t. Let’s sample the velocity by
picking some instant of time c during [p, q], and seeing what the velocity is
then. We’ll pretend that the sampled velocity is the actual velocity for the
whole interval [p, q]. If we write the velocity v as v(t) to emphasize that v
is a function of t, then the velocity at time c is v(c). So, here’s a graphical
interpretation of what we’re doing:

[Figure: graph of v against t on [a, b], with the curve above [p, q] replaced by a horizontal segment at height v(c), where c lies in [p, q].]

We’ve flattened out the curve above [p, q] at a height of v(c). The advantage
of this is that we can get some idea about the displacement over the time
interval [p, q]. The area of the little rectangle of height v(c) and base q − p
is v(c) × (q − p). Now, this isn’t actually the correct displacement over that
time period, but it’s mighty close.

Why stop at just one little interval like [p, q]? Let’s repeat the process on
an entire partition of [a, b]. Starting with the partition

\[ a = t_0 < t_1 < t_2 < \cdots < t_{n-2} < t_{n-1} < t_n = b, \]

let’s sample the velocity during each time period. The first time interval is
from $t_0$ to $t_1$, so let’s pick some time $c_1$ in that interval and pretend that the
velocity is equal to $v(c_1)$ for the whole period. The number $c_1$ could be equal
to the beginning number $t_0$ or the end number $t_1$ or some number in between,
as long as it lies in $[t_0, t_1]$. Now, repeat this for the second interval: pick $c_2$ in
the interval $[t_1, t_2]$, and use $v(c_2)$ as the sample velocity for that period. Keep
doing this for every interval, up until $c_n$ in the interval $[t_{n-1}, t_n]$. Here’s an
example of what this could look like with n = 6:

[Figure: the velocity curve on [a, b] approximated by six shaded rectangles of heights $v(c_1)$, $v(c_2)$, $v(c_3)$, $v(c_4)$, $v(c_5)$, and $v(c_6)$.]

All we’ve done is approximate the nice smooth velocity curve using some
staircase-like function, where each step intersects the curve. We can use the
techniques from the previous sections to work out the shaded (signed) area,
which will be an approximation to the actual area under the curve. We get

\[ \text{area under velocity curve} \approx \sum_{j=1}^{n} v(c_j)(t_j - t_{j-1}). \]

Unfortunately, the approximation is pretty lousy. That big rectangle on the
right in the picture above doesn’t really do a great job of approximating the
area under the part of the curve above $[t_5, t_6]$, since there’s so much of the
rectangle above the curve. So let’s take a different partition with more
intervals which are smaller, for example:



Here we used a partition with 16 intervals instead of 6, and it looks as if the
shaded area is a much better approximation to the actual area than our
previous attempt yielded. This wouldn’t have been true if we had used a lot of
intervals in our partition but left some of the little intervals quite wide.
For example, check out this picture:



Even though most of the rectangles are pretty narrow, that one big-ass
rectangle screws up the approximation. So somehow we need to make all the little
time intervals small. The way to do this is to let the mesh of the partition be
the longest of all the time intervals, then insist that the mesh get smaller and
smaller, eventually down to 0 in the limit. That way, all the time intervals
will become small and you won’t have a huge rectangle like the one in the
above picture.

Formally, the mesh is defined by 


mesh = maximum of $(t_1 - t_0), (t_2 - t_1), \ldots, (t_{n-1} - t_{n-2}), (t_n - t_{n-1})$.



For example, if you have the partition 3 < 3.25 < 4 < 4.5 < 5 of [3, 5] (which
was the partition that we used for the third car in Section 15.2.1 above), then
the lengths of the little intervals are 0.25 (which is 3.25 - 3), 0.75 (that’s
4 - 3.25), 0.5 (4.5 - 4), and 0.5 (5 - 4.5). The largest of the quantities 0.25,
0.75, 0.5, 0.5 is 0.75, so the mesh of the partition is 0.75.
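
If you’d like to check a mesh by machine, here is a minimal Python sketch. The function name `mesh` is my own choice for illustration, and the partition is the one from the example above.

```python
def mesh(partition):
    """Return the length of the longest subinterval of a partition.

    `partition` is a list of increasing numbers t_0 < t_1 < ... < t_n.
    """
    return max(right - left for left, right in zip(partition, partition[1:]))


partition = [3, 3.25, 4, 4.5, 5]

# The four subinterval lengths, then the largest of them.
widths = [right - left for left, right in zip(partition, partition[1:])]
print(widths)
print(mesh(partition))
```

The `zip(partition, partition[1:])` idiom pairs each point with its neighbor to the right, which is exactly the list of little intervals.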


Now we can try to replace the approximation

$$\text{area under velocity curve} \approx \sum_{j=1}^{n} v(c_j)(t_j - t_{j-1})$$

by a limit to get the actual answer. Suppose we repeat the above procedure 
over and over again, each time taking a partition which has a smaller mesh 
than the previous one, so that the meshes go down to 0 in the limit. Then 
the approximations should get better and better. This is what we’re trying 
to achieve in the following formula: 


$$\text{actual area under velocity curve} = \lim_{\text{mesh}\to 0} \sum_{j=1}^{n} v(c_j)(t_j - t_{j-1}).$$


For the mesh to go to 0, we need the number of small intervals in the partition
to get larger and larger, so the limit automatically includes the idea that
$n \to \infty$ as well.
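
To get a feel for this limit, here is a rough Python experiment. Everything in it is an assumption made for illustration: the velocity v(t) = t on [0, 4] is hypothetical (I picked it because the exact area is a triangle of area 8), and the helper names are mine. The partitions and sampling times are chosen at random, so nothing about them is special.

```python
import random


def riemann_sum(v, partition, samples):
    """Sum of v(c_j) * (t_j - t_{j-1}) over the subintervals of the partition."""
    return sum(v(c) * (t1 - t0)
               for t0, t1, c in zip(partition, partition[1:], samples))


def random_partition(a, b, n, rng):
    """A partition of [a, b] into n subintervals with random interior points."""
    interior = sorted(rng.uniform(a, b) for _ in range(n - 1))
    return [a] + interior + [b]


def mesh(partition):
    """Length of the longest subinterval."""
    return max(t1 - t0 for t0, t1 in zip(partition, partition[1:]))


rng = random.Random(0)       # fixed seed so that runs are repeatable
v = lambda t: t              # hypothetical velocity; exact area on [0, 4] is 8

for n in (10, 100, 1000, 10000):
    p = random_partition(0.0, 4.0, n, rng)
    # One random sampling time c_j inside each subinterval.
    c = [rng.uniform(t0, t1) for t0, t1 in zip(p, p[1:])]
    print(n, round(mesh(p), 4), round(riemann_sum(v, p, c), 4))
```

For this particular v, each sampled height is off from the midpoint height by at most half a subinterval width, so the sum can’t miss 8 by more than twice the mesh. As the mesh shrinks, the printed sums should crowd in on 8.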


15.2.5 Two special approximations

The above formula leaves a lot to be desired. How do you know that you
get the same answer no matter what partitions you take and no matter how
you choose the sampling times $c_j$? It’s actually a theorem that if v is a
continuous function of t, then the above limit is independent of the partitions
and sampling times. The proof of the theorem is a little advanced for this
book, but can be found in most textbooks on real analysis. On the other
hand, we can get an idea of the flavor of the proof by investigating two special
approximations: the upper sum and the lower sum.

Starting with a partition, we are allowed to pick sample points in each of
the little intervals. Suppose that we always pick a point where the velocity
is the greatest possible. For example, we’ll choose $c_1$ in the interval $[t_0, t_1]$ so
that $v(c_1)$ is the maximum possible value of v on that interval. We’ll do the
same for each of the intervals. This means that all our steps lie above the
curve. Here’s an example of what this looks like:

The area of the rectangles, which is called an upper sum, is clearly bigger than
the area under the curve. On the other hand, if we always sample the lowest
possible velocity, then we get a situation like this:



The partition is the same, but the sampling times are different. Because of
the way they’re chosen, all the steps lie below the curve; the area of all the
rectangles, which is called a lower sum, is less than the area under the curve.

Combining these observations, we have 

$$\text{lower sum} \le \text{actual area under curve} \le \text{upper sum}.$$

In fact, for the same partition, any choice of the sampling times $c_j$ will lead
to an area between the lower sum and the upper sum. If you use a sequence
of partitions with smaller and smaller meshes, then the lower sum and the
upper sum have the same limit (that’s what I’m not going to prove). The
sandwich principle then shows that the formula at the end of the previous
section makes sense. It doesn’t matter what values of $c_j$ you choose: your
sums are trapped, along with the actual area, between the lower and upper
sums. As the mesh goes to zero, the sandwich principle ensures that your
sums converge to the correct area.
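
Here is a small Python sketch of the sandwich in action. It leans on a simplifying assumption that the text doesn’t make: the velocity is increasing, so the lowest value on each little interval sits at the left endpoint and the highest at the right endpoint. The velocity v(t) = t on [0, 4] is hypothetical (exact area 8), and the function name is mine.

```python
def lower_upper_sums(v, a, b, n):
    """Lower and upper sums for an INCREASING v on n equal pieces of [a, b]."""
    width = (b - a) / n
    points = [a + j * width for j in range(n + 1)]
    lower = sum(v(t0) * width for t0 in points[:-1])   # minimum is at the left endpoint
    upper = sum(v(t1) * width for t1 in points[1:])    # maximum is at the right endpoint
    return lower, upper


v = lambda t: t          # hypothetical increasing velocity; exact area on [0, 4] is 8

for n in (4, 40, 400):
    lo, hi = lower_upper_sums(v, 0.0, 4.0, n)
    print(n, lo, hi, hi - lo)
```

The gap between the two sums shrinks as n grows, and the true area stays trapped between them the whole time. For a velocity that isn’t monotonic you’d have to hunt for the maximum and minimum on each piece, which this sketch doesn’t attempt.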

We now have all the tools we need to define the definite integral. This is
the subject of the next chapter.









CHAPTER 16 


Definite Integrals 

Now it’s time to get some facts straight about definite integrals. First we’ll 
give an informal definition in terms of areas; then we’ll use our ideas about 
partitions from the previous chapter to tighten up the definition. After one 
(exhausting) example of applying the tightened-up definition, we’ll see what 
else we can say about definite integrals. More precisely, we’ll look at the 
following topics: 

• signed areas and definite integrals; 

• the definition of the definite integral;

• an example using this definition; 

• basic properties of definite integrals; 

• using integrals to find unsigned areas, the area between two curves, and
areas between a curve and the y-axis;

• estimating definite integrals; 

• average values of functions and the Mean Value Theorem for integrals; 
and 

• an example of a nonintegrable function. 

16.1 The Basic Idea 

We start off with some function f and an interval [a, b]. Take the graph of
y = f(x), and consider the region between the curve, the x-axis, and the two
vertical lines x = a and x = b:









region. Since there aren’t actually any units of length in the above picture,
we’ll just call them “units,” so that area is measured in “square units.” (If
the above picture actually had some units, like inches, marked on it, the area
would be given in square inches instead.) In any case, let’s say that the area
of the shaded region above, in square units, is

$$\int_a^b f(x)\,dx.$$

This is a definite integral. You would read it out loud as “the integral from a
to b of f(x) with respect to x.” The expression f(x) is called the integrand
and tells you what the curved part looks like. The a and b tell you where
the two vertical lines go, and are called the limits of integration (not to be
confused with regular old limits!) or the endpoints of integration. Finally,
the dx tells you that x is the variable on the horizontal axis. Actually, x is
a dummy variable: you can change it to any other letter, provided that you
change it everywhere. So all the following are equal to each other:

$$\int_a^b f(x)\,dx = \int_a^b f(t)\,dt = \int_a^b f(q)\,dq = \int_a^b f(\beta)\,d\beta.$$

In fact, they are all equal to the same number, which is the shaded area (in
square units) in the above picture; the only difference is that we are renaming
the x-axis to be the t-axis, q-axis, or β-axis. This doesn’t affect the value of
the area!


What if the function dips below the x-axis? The situation could look like
this:



As we saw in Section 15.2.3 of the previous chapter, it makes sense for the
part of the area below the x-axis to count as negative area. If all of the curve
y = f(x) between x = a and x = b actually lies below the x-axis, then the
integral must be negative. In general, the integral gives the total amount of
signed area. More precisely,


$\int_a^b f(x)\,dx$ is the signed area (in square units) of the region between the
curve y = f(x), the lines x = a and x = b, and the x-axis.

does the car go in one second? How about two seconds? The answer is given
by the above integrals! Just replace x by t and you’re golden. First, note that
the displacement and distance are the same thing, since the car’s going in the
positive direction all the time. So, in the first second, we have




These displacements are in yards, of course.

Now, let’s take a look at another definite integral:

$$\int_{-2}^{5} 1\,dx.$$

To find the value of this integral, we need to draw a graph of y = 1, then put
in the vertical lines x = -2 and x = 5. The area we’re looking for looks like







This is true by symmetry: every bit of area above the x-axis has a corresponding
bit of area below the x-axis, just as in the above picture. This fact can
actually save you a lot of time, since it means that you don’t have to do any
calculations if your integral happens to fit the above template. We’ll give a
more formal proof of the above fact at the end of Section 18.1.1 of Chapter 18.

16.2 Definition of the Definite Integral

We have a nice working definition of the definite integral in terms of area,
but that doesn’t really help us to calculate specific integrals. Sure, we got by
in the last few examples, but only because we already know how to find the
area of any triangle or rectangle. We also got lucky with that last example
involving sin(x), because everything canceled out. In general, we won’t be so
lucky.

Actually, we’ve been in this situation before in the case of derivatives. We
could have defined $f'(x)$ to be the slope of the tangent to y = f(x) at the
point (x, f(x)), but that wouldn’t have told us how to find the slope. Instead,
we defined $f'(x)$ by the formula

$$f'(x) = \lim_{h \to 0} \frac{f(x+h) - f(x)}{h},$$

provided that the limit exists. As we’ve observed, this limit is of the
indeterminate form 0/0, but we can still work it out in many cases. Anyway, once
we’ve made the above definition, the interpretation is that $f'(x)$ represents
the slope of the tangent we’re interested in.

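Unfortunately, the definition of the definite integral is a lot nastier than the
above definition of the derivative. The good news is that we’ve already done
the grunt work in the previous chapter, and we can just state the definition:

$$\int_a^b f(x)\,dx = \lim_{\text{mesh}\to 0} \sum_{j=1}^{n} f(c_j)(x_j - x_{j-1}),$$

where $a = x_0 < x_1 < \cdots < x_{n-1} < x_n = b$ and $c_j$ is in $[x_{j-1}, x_j]$ for each
j = 1, ..., n.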



Even though that definition is wordy, it still doesn’t tell the full story! You 
also need to be aware of the following points: 

• The expression $a = x_0 < x_1 < \cdots < x_{n-1} < x_n = b$ means that the
points $x_0, x_1, x_2, \ldots, x_{n-1}$, and $x_n$ form a partition of the interval [a, b],
with $x_0 = a$ on the left and $x_n = b$ on the right. The partition creates
n smaller intervals $[x_0, x_1]$, $[x_1, x_2]$, and so on up to $[x_{n-1}, x_n]$.

• The mesh of the partition is the maximum length of these smaller
intervals; so we have

mesh = maximum of $(x_1 - x_0), (x_2 - x_1), \ldots, (x_{n-1} - x_{n-2}), (x_n - x_{n-1})$.

• The numbers $c_j$ can be chosen anywhere in their corresponding smaller
intervals, one for each smaller interval. This is what is meant by saying
that $c_j$ is in $[x_{j-1}, x_j]$.

• The above limit is taken by repeating the calculation of the sum for
different partitions with smaller and smaller mesh, and consequently
more and more smaller intervals; that is, as $\text{mesh} \to 0$, we must also
have $n \to \infty$. Each partition involves a choice of all the numbers $c_j$.

• If f is continuous, then it doesn’t matter what partitions are used, nor
which $c_j$ are chosen, as long as the mesh goes to 0. In fact, this is also
true if f has a finite number of discontinuities, as long as f is bounded.
Such functions are referred to as integrable, since they can be integrated.
There are functions which are integrable even though they might have
infinitely many discontinuities, but that’s a little advanced for this book.
On the other hand, if f is unbounded, which would happen (for example)
if it has a vertical asymptote, then the integral is called improper; see
Chapters 20 and 21 for how to deal with this sort of thing.

• The sum

$$\sum_{j=1}^{n} f(c_j)(x_j - x_{j-1})$$

which appears in the definition is called a Riemann sum. It gives an
approximate value for the integral. If the mesh of the partition is very
small, the approximation should be pretty good.

See, I told you it was nasty! Now we’ll see how to use the definition to calculate
a definite integral.


16.2.1 An example of using the definition

Let’s use the above formula to find the following integral:

$$\int_0^2 x^2\,dx.$$

So we are looking for the following area:



This isn’t a triangle or a rectangle, and nothing cancels out since the area is
entirely above the x-axis. So let’s set $f(x) = x^2$ and use the definition of the
definite integral to find the area.

We need to take partitions with smaller and smaller meshes. By far the
easiest way to do this is to use small intervals of equal size. So, we want to
chop up our interval [0, 2] into n pieces, each the same length. Since the total
length is 2, and we’re using n pieces, each piece must have length 2/n units.
The first piece goes from 0 to 2/n; the second piece goes from 2/n to 4/n; and
so on. Zooming in on the region of interest, here’s a picture of what we’ve
done:



In this case, the general partition

$$a = x_0 < x_1 < x_2 < \cdots < x_{n-1} < x_n = b$$

specializes to

$$0 = \frac{0}{n} < \frac{2}{n} < \frac{4}{n} < \cdots < \frac{2(n-1)}{n} < \frac{2n}{n} = 2.$$

The mesh of this partition is 2/n, since every smaller interval has width 2/n.
It’s also pretty clear that the formula for a general $x_j$ in this partition is 2j/n.
Now, we need to choose our numbers $c_j$. For example, $c_1$ could be anywhere
in the interval [0, 2/n], $c_2$ could be anywhere inside [2/n, 4/n], and so on.
We’ll make life simple by always choosing the right endpoint of each smaller
interval, so that $c_j = x_j = 2j/n$. That is,

$$c_j = \frac{2j}{n} \text{ is our choice for the smaller interval } [x_{j-1}, x_j] = \left[\frac{2(j-1)}{n}, \frac{2j}{n}\right].$$


This will lead to the following rectangles:

All in all, we have shown that the shaded area of the rectangles in the above
picture (in square units) is given by

$$\sum_{j=1}^{n} f(c_j)(x_j - x_{j-1}) = \frac{4(n+1)(2n+1)}{3n^2}.$$

This is only an approximation to the area we’re looking for. Since the mesh
of the partition is 2/n, we can force the mesh to go to 0 by letting $n \to \infty$.
The rectangles become smaller and smaller, but there are more and more of
them which hug the curve $y = x^2$ better and better. So we have

$$\int_0^2 x^2\,dx = \lim_{n\to\infty} \sum_{j=1}^{n} f(c_j)(x_j - x_{j-1}) = \lim_{n\to\infty} \frac{4(n+1)(2n+1)}{3n^2}.$$



All that’s left is to find the last limit. You can use the techniques from
Section 4.3 of Chapter 4 to show that the limit is 8/3, so we have finally
shown that

$$\int_0^2 x^2\,dx = \frac{8}{3}.$$

The area we’re looking for is 8/3 square units. Now you should try to repeat
the above method to show that

$$\int_0^2 x\,dx = 2.$$

As you can tell, this method is a pain in the butt. Not only is it long and
involved, but you also need to know how to find the sum

$$\sum_{j=1}^{n} j^2.$$

If the integrand was $x^3$ instead of $x^2$, you’d need to deal with

$$\sum_{j=1}^{n} j^3.$$

Things would be even worse if the integrand happened to be sin(x) or something
similar. So we need another method in order to avoid all these rectangles
and sums. That will have to wait until we look at the Second Fundamental
Theorem of Calculus in the next chapter. In the meantime, let’s look at some
nice properties of definite integrals.
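
Before moving on, the right-endpoint calculation from this section can be checked by machine. The following Python sketch is only an illustration: the helper name `right_endpoint_sum` is my own, and exact fractions are used so that the comparison with the closed form 4(n+1)(2n+1)/(3n²) isn’t blurred by rounding.

```python
from fractions import Fraction


def right_endpoint_sum(f, a, b, n):
    """Riemann sum on n equal pieces of [a, b], sampling at right endpoints.

    a, b, and n are integers here so that all arithmetic stays exact.
    """
    width = Fraction(b - a, n)
    return sum(f(a + j * width) * width for j in range(1, n + 1))


square = lambda x: x * x

for n in (1, 10, 100, 1000):
    exact_sum = right_endpoint_sum(square, 0, 2, n)
    closed_form = Fraction(4 * (n + 1) * (2 * n + 1), 3 * n * n)
    # Second column: does the sum agree with the closed form?
    # Third column: its decimal value, which should approach 8/3.
    print(n, exact_sum == closed_form, float(exact_sum))
```

Swapping `square` for `lambda x: x` gives the sums for the integral of x over [0, 2], which work out to 2(n+1)/n and head toward 2.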


16.3 Properties of Definite Integrals



Let’s extend our definition of the definite integral a little bit. What do you 
think of 










The whole area, from x = -2 to x = 3, is clearly the sum of the two areas
labeled I and II. By definition, we have

$$\int_{-2}^{3} f(x)\,dx = \int_{-2}^{1} f(x)\,dx + \int_{1}^{3} f(x)\,dx.$$

All we’ve done is split up the area into two pieces and express this in terms of
integrals. Of course, we could have split up the integral using any number in
the interval [-2, 3], as long as we replaced both the 1s in the above formula
by the same number. In fact it even works when the number is outside the
interval [-2, 3]. For example, the following formula is true:

$$\int_{-2}^{3} f(x)\,dx = \int_{-2}^{4} f(x)\,dx + \int_{4}^{3} f(x)\,dx.$$
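
Both splitting formulas can be tried out numerically. This Python sketch assumes the usual convention that running an integral backward (from 4 down to 3) flips its sign; the midpoint sum below does that automatically, because the strip widths come out negative. The integrand x² - 1 is a hypothetical choice, and `midpoint_sum` is my own helper name.

```python
def midpoint_sum(f, a, b, n=2000):
    """Midpoint Riemann sum on n equal pieces.

    If b < a, the widths are negative, which reverses the sign of the result.
    """
    width = (b - a) / n
    return sum(f(a + (j + 0.5) * width) for j in range(n)) * width


f = lambda x: x * x - 1          # hypothetical integrand

whole = midpoint_sum(f, -2.0, 3.0)
split_inside = midpoint_sum(f, -2.0, 1.0) + midpoint_sum(f, 1.0, 3.0)
split_outside = midpoint_sum(f, -2.0, 4.0) + midpoint_sum(f, 4.0, 3.0)
print(whole, split_inside, split_outside)
```

All three numbers should agree up to the small error of the approximation: overshooting to 4 adds some extra area, and coming back from 4 to 3 takes it away again.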

Here’s a picture of what’s going on:

and pull the constant C out of the sum and the limit:

$$\int_a^b C f(x)\,dx = C \lim_{\text{mesh}\to 0} \sum_{j=1}^{n} f(c_j)(x_j - x_{j-1}) = C \int_a^b f(x)\,dx.$$


For example, to find

$$\int_0^2 7x^2\,dx,$$

just drag the 7 outside the integral:

$$\int_0^2 7x^2\,dx = 7 \int_0^2 x^2\,dx.$$

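The second property is that integrals respect sums and differences.
That is, if f and g are both integrable functions, and a and b are two numbers,
then

$$\int_a^b (f(x) + g(x))\,dx = \int_a^b f(x)\,dx + \int_a^b g(x)\,dx.$$

The same is true if you change both plus signs to minus signs. Either version
is easy to show using partitions. All you have to do is break up the sum and
limit, like this:

$$\begin{aligned}
\int_a^b (f(x) + g(x))\,dx &= \lim_{\text{mesh}\to 0} \sum_{j=1}^{n} (f(c_j) + g(c_j))(x_j - x_{j-1}) \\
&= \lim_{\text{mesh}\to 0} \sum_{j=1}^{n} f(c_j)(x_j - x_{j-1}) + \lim_{\text{mesh}\to 0} \sum_{j=1}^{n} g(c_j)(x_j - x_{j-1}) \\
&= \int_a^b f(x)\,dx + \int_a^b g(x)\,dx.
\end{aligned}$$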

The same thing works with minus signs instead of plus signs.
For example, to find

$$\int_0^2 (3x^2 - 5x)\,dx,$$

split up the integral and also drag the constants through the integral signs.
We get

$$\int_0^2 (3x^2 - 5x)\,dx = 3\int_0^2 x^2\,dx - 5\int_0^2 x\,dx = 3\left(\frac{8}{3}\right) - 5(2) = -2.$$

Here we have used the facts from above that

$$\int_0^2 x^2\,dx = \frac{8}{3}$$

and

$$\int_0^2 x\,dx = 2.$$
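
The “break up the sum” step can be seen happening before any limit is taken. In this Python sketch (helper name and the choice n = 50 are mine, purely for illustration), the Riemann sum for 3x² - 5x is compared with 3 times the sum for x² minus 5 times the sum for x, using exact fractions.

```python
from fractions import Fraction


def right_endpoint_sum(f, a, b, n):
    """Riemann sum on n equal pieces of [a, b], sampling at right endpoints."""
    width = Fraction(b - a, n)
    return sum(f(a + j * width) * width for j in range(1, n + 1))


n = 50
combined = right_endpoint_sum(lambda x: 3 * x**2 - 5 * x, 0, 2, n)
pieces = (3 * right_endpoint_sum(lambda x: x**2, 0, 2, n)
          - 5 * right_endpoint_sum(lambda x: x, 0, 2, n))

# The sum splits exactly, for every n, not just in the limit.
print(combined == pieces)

# Value of the integral assembled from the two known facts 8/3 and 2.
print(3 * Fraction(8, 3) - 5 * 2)
```

Since the two sides agree for each partition, they must still agree once the mesh goes to 0, which is the whole argument.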





16.4 Finding Areas 

If y = f(x), then we can write

$$\int_a^b y\,dx$$

instead of using f(x) as the integrand. This has a nice geometrical
interpretation: if we look at one of our thin rectangles, or strips, arising from the
partition method, we can think of it as having height y units and width equal
to some small length dx units:



The area of the strip is the height times the width, or y dx square units. Now
draw in more strips so that the bases form a partition of [a, b]. If we were
to add up the areas of all these strips, we’d get an approximating sum. The
beauty of the integral sign is that it not only adds up the areas of all the
strips, it also takes the limit as all the strip widths go to 0.

This idea is useful in helping to understand how to use the integral to find 
areas. Now, let’s spend a little time looking at how to find three specific types 
of areas: unsigned area, the area between two curves, and the area between a 
curve and the y-axis. 

16.4.1 Finding the unsigned area 

We’ve seen that definite integrals deal with signed areas. Sure, if your curve
is always above the x-axis, then it doesn’t matter whether the area is signed
or unsigned. But what if some of the curve lies below the axis? For example,
suppose that $f(x) = -x^2 - 2x + 3$ and the region of interest is between x = 0
and x = 2. Since f(0) = 3 and f(2) = -5, the curve y = f(x) looks like this:


[Figure: graph of $y = -x^2 - 2x + 3$ for x between 0 and 2; region I lies above the x-axis and region II lies below it.]

If you treat the shaded area as signed, so that the area of the region labeled
II counts as negative, then we have

$$\text{signed area} = \int_0^2 (-x^2 - 2x + 3)\,dx = -\int_0^2 x^2\,dx - 2\int_0^2 x\,dx + 3\int_0^2 1\,dx.$$

Here we’ve broken up the integral using the principles from the previous
section. We also know what all three integrals are, having found them above.
We get

$$\text{shaded signed area} = -\frac{8}{3} - 2(2) + 3(2) = -\frac{2}{3}.$$


This is clearly not the unsigned area, since it’s negative! So, how do you find
the unsigned area? The trick is to break up the integral into pieces to isolate
the bits of area above and below the axis, then add up their absolute values.
In the above example, we need to know where the curve hits the x-axis. So
just solve $-x^2 - 2x + 3 = 0$ and you will see that x = 1 or x = -3. Clearly
x = 1 is what we’re looking for here, since it’s between 0 and 2, while -3
isn’t.

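Now we can write down two integrals:

$$\int_0^1 (-x^2 - 2x + 3)\,dx \quad\text{and}\quad \int_1^2 (-x^2 - 2x + 3)\,dx.$$

These represent the signed areas of regions I and II, respectively, in the above
picture. To calculate the integrals, you’ll need some formulas that we’ve
developed earlier in this chapter:

$$\int_0^1 x^2\,dx = \frac{1}{3}; \qquad \int_0^1 x\,dx = \frac{1}{2}; \qquad \int_0^1 1\,dx = 1;$$

$$\int_1^2 x^2\,dx = \frac{7}{3}; \qquad \int_1^2 x\,dx = \frac{3}{2}; \qquad \int_1^2 1\,dx = 1.$$

I leave it to you to work out that

$$\int_0^1 (-x^2 - 2x + 3)\,dx = \frac{5}{3} \quad\text{and}\quad \int_1^2 (-x^2 - 2x + 3)\,dx = -\frac{7}{3}.$$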


As expected, the first integral is positive since region I is above the axis, and
the second is negative since region II lies below the axis. Also, the sum of
the two integrals is -2/3, which is the signed area (in square units). Now,
here’s the important point: we can get the actual area of region II just by
ignoring the minus sign! This works because the region is entirely below the
x-axis. So the actual area of region II is 7/3 square units, while region I has
area 5/3 square units. The total area is therefore 5/3 + 7/3 = 4 square units.
Effectively, we just took the absolute value of each of the two pieces 5/3 and
-7/3, then added them up.

We’ll look at another example of this in Section 17.6.3 of the next chapter.
Note that you should use the above method in order to find the distance
an object travels, as opposed to the displacement. Indeed, as we saw in
Section 16.1 above,

$$\text{distance} = \int_a^b |v(t)|\,dt,$$

so absolute values are involved and the above method applies.
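
The bookkeeping for the example with y = -x² - 2x + 3 is easy to mirror in a few lines of Python. This is just a sketch of the arithmetic, using the six known integrals quoted in this section; the variable and function names are my own.

```python
from fractions import Fraction as F

# Known integrals quoted in the text: (x^2, x, 1) over [0, 1] and over [1, 2].
pieces = {
    "I  (0 to 1)": (F(1, 3), F(1, 2), F(1)),
    "II (1 to 2)": (F(7, 3), F(3, 2), F(1)),
}


def signed_area(x2, x1, x0):
    """Integral of -x^2 - 2x + 3 on one piece, assembled by linearity."""
    return -x2 - 2 * x1 + 3 * x0


areas = {name: signed_area(*vals) for name, vals in pieces.items()}
signed_total = sum(areas.values())                    # pieces keep their signs
unsigned_total = sum(abs(a) for a in areas.values())  # take absolute values first
print(areas, signed_total, unsigned_total)
```

The only difference between the two totals is where the absolute value goes: it has to be applied to each piece separately, after splitting at the point where the curve crosses the x-axis.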


16.4.2 Finding the area between two curves 

Suppose you have two curves, one above the other, and you want to find the 
area of the region between the curves and the lines x = a and x = b. If the 
curves are y = f(x) and y = g(x), where the first is above the second, then 
the situation looks like this: 



The actual region we want to find the area of is labeled I. On the other hand,
the region II lies under the curve y = g(x), so it has signed area

$$\int_a^b g(x)\,dx.$$

So what is

$$\int_a^b f(x)\,dx?$$

That must be the signed area below the top curve all the way to the x-axis,
so it is actually the area of both regions put together. So we have

$$\int_a^b f(x)\,dx = \int_a^b g(x)\,dx + \text{signed area of region I}.$$

We can rearrange this and stick the two integrals together into one integral,
getting

$$\text{signed area of region I} = \int_a^b (f(x) - g(x))\,dx.$$
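
Here is a quick numerical check of that rearrangement in Python. The two curves are hypothetical choices of mine, f(x) = x + 2 on top and g(x) = x² underneath on [0, 1], and the integrals are approximated by midpoint Riemann sums rather than computed exactly.

```python
def midpoint_sum(f, a, b, n=1000):
    """Riemann sum on n equal pieces of [a, b], sampling at midpoints."""
    width = (b - a) / n
    return sum(f(a + (j + 0.5) * width) for j in range(n)) * width


top = lambda x: x + 2        # hypothetical upper curve f
bottom = lambda x: x * x     # hypothetical lower curve g; top(x) >= bottom(x) on [0, 1]

under_top = midpoint_sum(top, 0.0, 1.0)
under_bottom = midpoint_sum(bottom, 0.0, 1.0)
between = midpoint_sum(lambda x: top(x) - bottom(x), 0.0, 1.0)

# Subtracting the two areas and integrating the difference give the same thing.
print(under_top - under_bottom, between)
```

Using the known values of the integrals of x², x, and 1 over [0, 1], the exact answer for these particular curves is 1/2 + 2 - 1/3 = 13/6, and both printed numbers should land very close to it.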



So you just take the top curve’s function and subtract the bottom curve’s
function, then integrate. For example, let’s find the following shaded area:

On the interval [a, b], the function g always lies above f. (I know, I had them
the other way around in Section 16.4.2 above!) In any case, the area under
y = f(x) (down to the x-axis) is clearly less than the area under y = g(x)
(down to the x-axis). In symbols:

$$\text{if } f(x) \le g(x) \text{ for all } x \text{ in } [a, b], \text{ then } \int_a^b f(x)\,dx \le \int_a^b g(x)\,dx.$$



This is true even if one or both of the curves go below the x-axis, thanks
to the fact that we’re using signed areas. For example, if f is always below
the x-axis and g is always above the x-axis, then $\int_a^b f(x)\,dx$ is negative while
$\int_a^b g(x)\,dx$ is positive, and the above inequality is still true.

The proof of the statement in the box above is quite easy using Riemann
sums. Without getting into the gory details, you just have to take a partition
and note that $f(c_j) \le g(c_j)$ for all j, so the whole Riemann sum for f is less
than or equal to the corresponding sum for g. I leave it to you to take it from there.

There’s also a nice interpretation of the above fact in terms of velocity
and displacement. Suppose that there are two cars starting at the same place.
The first one travels with velocity f(t) at time t, while the second goes at a
velocity of g(t) at time t. Since the integral of velocity is the displacement,
the statement in the box above means that if the first car’s velocity is always
less than the second car’s velocity, then the first car’s displacement is less
than the second car’s displacement. This makes a lot of sense if you think
about it! The first car will always be to the left of the second car on our
mythical number line, because it just doesn’t have as much rightward oomph
as the second car does.


16.5.1 A simple type of estimation

We can use the above inequality to get a feel for how big or small a definite
integral is, without actually finding the integral. For example, suppose we’d
like to estimate $\int_a^b f(x)\,dx$, which is the value of the following area:

Let’s set M equal to the maximum value of f(x) on [a, b], and we’ll do the
same thing with the minimum value, except we’ll call it m instead. If we draw
in the lines y = M and y = m, then the situation looks like this:



Notice that the area we want is less than the area under y = M, but greater 
than the area under y = m. This is easy to see by drawing some more pictures: 



It’s not hard to find the area of the two rectangles in the left-hand and
right-hand pictures above. In the left-hand case, the base is (b - a) units and the
height is m units, so the area is m(b - a) square units. In the right-hand case,
the base is still (b - a) units but the height is now M units, so the area is
M(b - a) square units. So the above graphs indicate the following principle:

$$\text{if } m \le f(x) \le M \text{ for all } x \text{ in } [a, b], \text{ then } m(b - a) \le \int_a^b f(x)\,dx \le M(b - a).$$










integral. In fact, there isn’t any nice way to express the value without using 
an integral sign or a sum which goes on forever or some other trick. We can 
at least estimate the value of the integral by using the above principle. 

We need to find the maximum and minimum values of $y = e^{-x^2}$ on the
interval [0, 1/2]. The chain rule shows that $dy/dx = -2xe^{-x^2}$, which is 0 at the
endpoint 0 and is negative otherwise. This confirms that $e^{-x^2}$ is decreasing
in x on the interval [0, 1/2]; so the maximum value occurs when x = 0, and the
minimum value occurs when x = 1/2. Plugging these values in, we find that
the maximum value is $e^{-0^2} = 1$, and the minimum value is $e^{-(1/2)^2} = e^{-1/4}$.
That is, on the interval [0, 1/2], we have

$$e^{-1/4} \le e^{-x^2} \le 1.$$

By our principle from the box above with a = 0 and b = 1/2, we have

$$e^{-1/4}\left(\frac{1}{2} - 0\right) \le \int_0^{1/2} e^{-x^2}\,dx \le 1\left(\frac{1}{2} - 0\right).$$

So the value of the integral we’re looking for lies between $\frac{1}{2}e^{-1/4}$ and $\frac{1}{2}$. Again,
you can clearly see this by looking at the following graphs, which show the
underestimate and overestimate, respectively:



The areas of the two rectangles are $\frac{1}{2}e^{-1/4}$ and $\frac{1}{2}$ square units, respectively.

The above estimates are pretty crude. We can do a better job by using
more rectangles, or even more exotic shapes like trapezoids or parabola-topped
strips. See Appendix B for more details.
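
To see how crude the two-rectangle estimate is, you can compare it with a finer approximation. The following Python sketch is an illustration only: it computes the two bounds for the integral of e^(-x²) over [0, 1/2] and then a midpoint Riemann sum with 10000 strips (my choice) as an independent check.

```python
import math

a, b = 0.0, 0.5
f = lambda x: math.exp(-x * x)

# f is decreasing on [0, 1/2], so the minimum is at b and the maximum is at a.
m, M = f(b), f(a)
lower, upper = m * (b - a), M * (b - a)

# Midpoint Riemann sum with many strips; an approximation, not a closed form.
n = 10000
width = (b - a) / n
approx = sum(f(a + (j + 0.5) * width) for j in range(n)) * width

print(lower, approx, upper)
```

The middle number should sit comfortably between the other two, roughly 0.46 against bounds of about 0.39 and 0.5.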


16.6 Averages and the Mean Value Theorem for Integrals


At last, we can return to average velocities. Yes, once upon a time, we thought
nothing of saying that speed equals distance over time, or better still, velocity
equals displacement over time. That’s fine as long as the velocity is constant;
otherwise, as we saw in Section 5.2.3 in Chapter 5, we really need to say
average velocity.

Then we learned how to use differentiation to find the instantaneous
velocity, knowing what the displacement is at all times during the time interval
of interest. Using integration, we can find the displacement, knowing what
the instantaneous velocity is at all times during our time interval. This last
fact also allows us to find the average velocity, knowing the instantaneous
velocity at all times. All you have to do is find the displacement and divide
it by the total time. If the time interval goes from a to b, and the velocity at
time t is v(t), then we’ve already seen that

$$\text{displacement} = \int_a^b v(t)\,dt.$$

Since the total time is b - a, we have

$$\text{average velocity} = \frac{\text{displacement}}{\text{total time}} = \frac{1}{b-a}\int_a^b v(t)\,dt.$$


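More generally, we can define the average value of an integrable function f
on the interval [a, b] as follows:

$$\text{average value of } f \text{ on } [a, b] = \frac{1}{b-a}\int_a^b f(x)\,dx.$$

For example, what is the average value of f on the interval [0, 2], where
$f(x) = x^2$? No problem:

$$\text{average value} = \frac{1}{2-0}\int_0^2 x^2\,dx = \frac{1}{2}\cdot\frac{8}{3} = \frac{4}{3}.$$

All you have to do is divide the integral by the difference between the limits
of integration.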
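
In code, the average value is just an approximate integral divided by the length of the interval. The Python sketch below is a rough illustration with a made-up helper name, `average_value`, and a midpoint Riemann sum standing in for the integral.

```python
def average_value(f, a, b, n=100000):
    """Approximate (1/(b-a)) times the integral of f over [a, b] by a midpoint sum."""
    width = (b - a) / n
    integral = sum(f(a + (j + 0.5) * width) for j in range(n)) * width
    return integral / (b - a)


# Average of x^2 on [0, 2]; this should come out close to 4/3.
print(average_value(lambda x: x * x, 0.0, 2.0))
```

Notice that with equal strips the widths cancel, so the result is simply the ordinary average of the sampled heights. That is a decent way to remember what the formula means.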

Let’s look at a geometrical interpretation of this. Let’s write the average
value of f on [a, b] as $f_{av}$ for short. Here’s an example of what the graphs of
y = f(x) and $y = f_{av}$ might look like:

Notice that $f_{av}$ is just a constant number, so the graph of $y = f_{av}$ is a
horizontal line. Now, by the above boxed formula, we have

$$f_{av} = \frac{1}{b-a}\int_a^b f(x)\,dx.$$

Multiplying by (b - a), we see that

$$\int_a^b f(x)\,dx = f_{av} \times (b - a).$$

This actually says that the following two areas are equal:

[Figure: two pictures side by side over [a, b]; on the left, the area under y = f(x); on the right, a rectangle of height $f_{av}$.]

After all, the rectangle in the right-hand picture has height $f_{av}$ units and base
(b - a) units, so its area is $f_{av} \times (b - a)$ square units. You can think of it
this way: if you disturb the water in a thin long fish tank so that the water
surface looks like y = f(x) for an instant, then after the water stabilizes, the
surface will look like the horizontal line $y = f_{av}$.

16.6.1 The Mean Value Theorem for integrals

In the above graphs, observe that the horizontal line $y = f_{av}$ intersects the
graph of y = f(x). Let’s label the corresponding point on the x-axis as c, like
this:






























have $f(c) = f_{av}$. It turns out that if f is continuous, then there is
such a number c:

Mean Value Theorem for integrals: if f is continuous on [a, b],
then there exists c in (a, b) such that

$$f(c) = \frac{1}{b-a}\int_a^b f(x)\,dx.$$


In words, you could say that “a continuous function attains its average value
at least once.” For example, we saw in the previous section that the average
value of $f(x) = x^2$ on [0, 2] is 4/3. According to the above theorem, we must
have f(c) = 4/3 for some c in [0, 2]. Since $f(c) = c^2$, we can see that $c = 2/\sqrt{3}$
is a solution which does indeed lie in [0, 2] (unlike the other possible solution,
$c = -2/\sqrt{3}$).
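
If you couldn’t solve for c by hand, you could still hunt for it numerically. The Python sketch below uses bisection, which is my own choice of method and not something from the text; it assumes f is continuous and that the target value lies between f at the two ends of the search interval.

```python
import math


def find_c(f, target, lo, hi, steps=60):
    """Bisection for f(c) = target, assuming f(lo) <= target <= f(hi), f continuous."""
    for _ in range(steps):
        mid = (lo + hi) / 2
        if f(mid) < target:
            lo = mid        # the crossing is to the right of mid
        else:
            hi = mid        # the crossing is at mid or to its left
    return (lo + hi) / 2


f_av = 4 / 3                                  # average of x^2 on [0, 2], from the text
c = find_c(lambda x: x * x, f_av, 0.0, 2.0)
print(c, 2 / math.sqrt(3))
```

The two printed numbers should match to many decimal places, and both lie strictly between 0 and 2, as the theorem promises.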

If you think of the above theorem in terms of velocities, it just says that
$v(c) = v_{av}$ for some c in [a, b]. This means that for any journey, there is some
point in time (c) such that the velocity at that time (v(c)) equals the average
velocity ($v_{av}$). No matter how hard you try, during any journey you make,
there must be at least one instant of time where your instantaneous velocity
equals your average velocity. There might be more than one such instant, but
there can’t be none. Even if you go at 45 mph for an hour and 55 mph for an
hour, for an average velocity of 50 mph, you will still have to go at 50 mph
for an instant while you’re accelerating from 45 to 55.

So, why is the above theorem also called the Mean Value Theorem? After 
all, we already have a Mean Value Theorem. If you look back at our discussion 
of the original theorem in Section 11.3 of Chapter 11, you’ll see that we reached
the same conclusion as we did above: the instantaneous velocity has to equal 
the average velocity at some point during any journey. The difference between 
the two versions of the theorem is that in the regular version, the conclusion 
was interpreted in terms of slopes on the graph of displacement versus time; 
whereas now we have interpreted it in terms of areas on the graph of velocity 
versus time. 

Now let’s see why the theorem is true. As we did in Section 16.5 above,
we’ll let M be the maximum value of f on [a, b], and m be the minimum
value of f on [a, b]. Could $f_{av}$ possibly be greater than M? If so, the situation
would look like this:



[Figure: graph of y = f(x) on [a, b], lying entirely below the horizontal line $y = f_{av}$.]

There’s no way that the area of the rectangle of height $f_{av}$ could equal the
area of the shaded region under y = f(x), since the curve lies entirely below
the line $y = f_{av}$; so this situation can’t happen. In a similar way, $f_{av}$ can’t
be less than m.

It must lie between m and M. The Intermediate Value Theorem implies that
f takes every value between m and M (can you see why?), so in particular,
f takes on the value $f_{av}$ somewhere. That is, $f(c) = f_{av}$ for some c and the
theorem is true. We’ll use the theorem in Section 17.8 of the next chapter
when we prove the First Fundamental Theorem of Calculus.

16.7 A Nonintegrable Function 

In Section 16.2 above, I mentioned that if f is bounded and has only a finite
number of discontinuities in [a, b], then f is integrable. That is, the integral
$\int_a^b f(x)\,dx$ exists. By the way, recall that discontinuities are a deal-breaker
as far as differentiability is concerned: if f is discontinuous at x = a, then it
can’t be differentiable there. (See Section 5.2.11 of Chapter 5.) Integration is
a little more forgiving, since it can deal with some discontinuities, as long as
there aren’t too many of them. Now, let’s look at an example of a function
where there are too many discontinuities.

First, remember that a rational number is a number that can be written 
in the form p/q where p and q are integers (with no common factor). An 
irrational number can’t be written in that form. Now, for x in the domain
[0, 1], let

$$f(x) = \begin{cases} 1 & \text{if } x \text{ is rational,} \\ 2 & \text{if } x \text{ is irrational.} \end{cases}$$

This is a pretty weird function. There are lots and lots of rational and
irrational numbers between 0 and 1. In fact, between every two rational
numbers, there’s an irrational number, and between every two irrational numbers,
there’s a rational number! So if you try to sketch a graph of y = f(x), you
might come up with the following picture:




[Figure: the graph of y = f(x) on [0, 1], drawn as two dotted horizontal segments at heights 1 and 2.]


The values of f(x) jump between heights 1 and 2 faster than you can imagine. 
There’s no connectivity whatsoever in the above line segments at heights 1 and 
2: they are full of holes. The function is actually discontinuous everywhere. 
So what on earth should

$$\int_0^1 f(x)\,dx$$

be? Let’s try taking upper and lower Riemann sums and see what we get.
Pick any partition of [0,1]. No matter how narrow they are, your strips will 
pick up some irrational point. So the upper sum must look something like 
this: 
[Figure: an upper Riemann sum for y = f(x) on [0, 1]; every rectangle reaches height 2.]

Every rectangle has to reach height 2 in order to create an upper sum, even if 
the rectangles are really really thin. Notice that the area of all the rectangles 
above is 2 square units, no matter how many there are, since they fill out a 
1-by-2 unit rectangle. In particular,

$$\lim_{\text{mesh} \to 0} (\text{upper Riemann sum}) = \lim_{\text{mesh} \to 0} 2 = 2.$$

Similarly, in the lower sum for the same partition, every rectangle has to be 
of height 1 unit. After all, no matter how thin a rectangle is, its base (on the 
x-axis) will still contain a rational number, and the function has height 1 at
all rational numbers. So a lower sum must look like this:

[Figure: a lower Riemann sum for y = f(x) on [0, 1]; every rectangle has height 1.]


Now the area is 1 square unit, since the total rectangle filled in by all the little 
strips is 1-by-1 unit. So we have shown that

$$\lim_{\text{mesh} \to 0} (\text{lower Riemann sum}) = \lim_{\text{mesh} \to 0} 1 = 1.$$

The limits, as the mesh goes to 0, for the upper and lower Riemann sums are 
different. This doesn’t happen for continuous functions, but it does happen 
for this crazy function! The only conclusion is that f cannot be integrated
on its domain [0, 1]. We say that f is nonintegrable. Actually, there is a way
to integrate this function, but it’s called Lebesgue integration (as opposed to 
Riemann integration) and it’s way beyond the scope of this book. So, let’s 
not worry about these sorts of pathological examples and concentrate instead 
on finding a nice way to find definite integrals of well-behaved, continuous 
functions. 
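To see in symbols why every upper sum for the weird function is 2 and every lower sum is 1, here is the computation for an arbitrary partition. The partition points $x_0, \dots, x_n$ and the labels U and L are notation introduced only for this sketch; they are not taken from the text above.

```latex
% Assumed notation: a partition 0 = x_0 < x_1 < \dots < x_n = 1 of [0, 1].
% On every strip the largest value of f is 2 and the smallest is 1.
\begin{align*}
U &= \sum_{j=1}^{n} 2\,(x_j - x_{j-1}) = 2\,(x_n - x_0) = 2, \\
L &= \sum_{j=1}^{n} 1\,(x_j - x_{j-1}) = x_n - x_0 = 1.
\end{align*}
```

Both sums telescope, so the answers 2 and 1 don’t depend on the partition at all, which is why shrinking the mesh can’t bring them together.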











CHAPTER 17 


The Fundamental Theorems of Calculus 


Here it is: the big kahuna. I’m talking about the Fundamental Theorems of 
Calculus, which not only provide the key for finding definite integrals without 
using messy Riemann sums, but also show how differentiation and integration 
are connected to each other. Without further ado, here’s the roadmap for the 
chapter: we’ll investigate 

• functions which are based on integrals of other functions; 

• the First Fundamental Theorem, and the basic idea of antiderivatives; 

• the Second Fundamental Theorem; and 

• indefinite integrals and their properties. 

After all this theoretical stuff, we’ll look at a lot of different examples in the 
following categories: 

• problems based on the First Fundamental Theorem; 

• finding indefinite integrals; and

• finding definite integrals and areas using the Second Fundamental Theorem.

17.1 Functions Based on Integrals of Other Functions

In the previous chapter, we used Riemann sums to show that

$$\int_0^1 x^2\,dx = \frac{1}{3} \qquad\text{and}\qquad \int_0^2 x^2\,dx = \frac{8}{3}.$$

(Actually, we only did the second one; I left the first one to you!) Unfortunately,
the method of Riemann sums was really nasty. It would be nice to
have an easier method to find the above integrals. Why stop there, though?
Let’s try to find the integral of $x^2$ from 0 to any number we like.
So we want to allow the right-hand limit of integration to be variable.
Everyone’s favorite variable is x, but you can’t write down

$$\int_0^x x^2\,dx$$

unless you want to be really confusing. After all, x is the dummy variable, 
so it can’t be a real variable too. So let’s start over, this time using t as the 
dummy variable. First, we have

$$\int_0^1 t^2\,dt = \frac{1}{3} \qquad\text{and}\qquad \int_0^2 t^2\,dt = \frac{8}{3}.$$

Remember, the letter we use for the dummy variable is irrelevant: we’ve just
renamed the x-axis to be the t-axis. The actual area doesn’t change. Now we
want to consider the quantity

$$\int_0^x t^2\,dt.$$


If you substitute x = 1 into this quantity, you get $\int_0^1 t^2\,dt$, which is equal
to 1/3; if instead you substitute x = 2, you get $\int_0^2 t^2\,dt$, which is 8/3. Why
stop there? You can substitute any number in place of x and get a different
integral. That is, the above quantity is a function of the right-hand limit of
integration, x. Let’s call the function F, so that

$$F(x) = \int_0^x t^2\,dt.$$


We have seen that F(1) = 1/3 and F(2) = 8/3. How about F(0)? Well,

$$F(0) = \int_0^0 t^2\,dt.$$

In Section 16.3 of the previous chapter, we saw that an integral with the same
left-hand and right-hand limits of integration must be 0. That is, we know
that F(0) = 0. Unfortunately, it’s not so easy to find many other values of F,
such as F(9), F(−7) or F(1/2). We’ll return to this point in the next section.
In the meantime, how would you describe F(x) in words? It’s precisely the
signed area (in square units) between the curve $y = t^2$, the t-axis, and the
vertical line t = x.

There are two ways we can make this whole thing more general. First, the 
left-hand endpoint doesn’t have to be 0. You could define another function G 
by setting 

$$G(x) = \int_2^x t^2\,dt.$$

The quantity G(x) is the area (in square units) of the region bounded by
$y = t^2$, the t-axis, and the lines t = 2 and t = x. So what is G(2)? Well,

$$G(2) = \int_2^2 t^2\,dt = 0,$$

since the left-hand and right-hand limits of integration are the same. How
about G(0)? We have

$$G(0) = \int_2^0 t^2\,dt.$$


To handle this, remember from Section 16.3 of the previous chapter that you 
can switch the limits of integration as long as you put a minus sign out front. 
So 

$$G(0) = \int_2^0 t^2\,dt = -\int_0^2 t^2\,dt = -\frac{8}{3}.$$

In fact, there’s a really nice relationship between F and G. First, let’s remind 
ourselves what these functions are: 


$$F(x) = \int_0^x t^2\,dt \qquad\text{and}\qquad G(x) = \int_2^x t^2\,dt.$$

Let’s split the first of these integrals up at t = 2; see Section 16.3 in the
previous chapter to remind yourself how to split up an integral. We get

$$\int_0^x t^2\,dt = \int_0^2 t^2\,dt + \int_2^x t^2\,dt.$$

The left-hand side is F(x). Meanwhile, the first term on the right-hand side
is just 8/3, while the second term is G(x). Altogether, we have shown that

$$F(x) = \frac{8}{3} + G(x).$$

That is, F and G differ by the constant 8/3. We can be even more general, 
though. Suppose that a is any fixed number, and set 

$$H(x) = \int_a^x t^2\,dt.$$

If you split the integral in the definition of F at t = a instead of t = 2, you
get this:

$$F(x) = \int_0^x t^2\,dt = \int_0^a t^2\,dt + \int_a^x t^2\,dt.$$

The second term on the right-hand side is exactly H(x), so we’ve shown that

$$F(x) = \int_0^a t^2\,dt + H(x).$$

So what? Well, the integral $\int_0^a t^2\,dt$ is actually a constant: it doesn’t depend
on x at all! Even though we didn’t specify the value of a, we did say it was 
constant, so the integral must also be constant. We’ve shown that 

F(x) = H(x) + C, 


where C is some constant that depends on a but not on x. The moral of the 
story is that changing the left-hand endpoint from one constant to another 
doesn’t make too much difference. 
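If you’d like to see the constant 8/3 appear on a computer, here is a minimal Python sketch. It is only an illustration: the helper `integrate` (a composite Simpson’s rule) and the function names `square`, `F` and `G` are choices made for this example, and the integrals are approximated numerically rather than found exactly.

```python
def integrate(f, a, b, n=1000):
    """Composite Simpson's rule for the signed integral of f from a to b.

    n must be even. When b < a the step h is negative, which supplies the
    minus sign that comes from switching the limits of integration.
    """
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3


def square(t):
    return t * t


def F(x):
    return integrate(square, 0.0, x)  # left-hand limit of integration is 0


def G(x):
    return integrate(square, 2.0, x)  # left-hand limit of integration is 2


for x in (-7.0, 0.5, 2.0, 9.0):
    # The last column should be the same constant, 8/3, for every x.
    print(x, F(x), G(x), F(x) - G(x))
```

Changing the 2.0 inside `G` to any other fixed number changes the constant in the last column, but it stays the same from row to row.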
Our second generalization is that the integrand doesn’t have to be $t^2$. It
can be any continuous function of t. Let’s suppose the integrand is f(t). If a
is some constant number, then let’s define

$$F(x) = \int_a^x f(t)\,dt.$$

For example, if a = 0 and $f(t) = t^2$, you get the original function F from
above. In general, for any number x, the value F(x) is the signed area (in
square units) between the curve y = f(t), the t-axis, and the vertical lines
t = a and t = x. Here is an example of what this might look like for three
different values of x:



The above pictures are reminiscent of a curtain with fixed left-hand edge, 
while the right-hand edge slides back and forth. The only unrealistic aspect 
is that the curtain rod at the top is pretty warped, unless the function f is
constant! In any case, note that the function F comes directly from the choice 
of the integrand f(t) and the number a. By splitting up the integral, you can 
show that changing the number a just changes the function F by a constant. 
All these ideas will be very important in the next couple of sections.... 


17.2 The First Fundamental Theorem


Here’s the goal: find

$$\int_a^b f(x)\,dx$$


without using Riemann sums. Let’s do three things which are not really 
obvious at all: 


1. First, let’s change the dummy variable to t and write the above integral
as

$$\int_a^b f(t)\,dt.$$

As we saw in the previous section, this doesn’t make any difference: the
name of the dummy variable doesn’t matter.

2. Now, let’s replace b by a variable x to get a new function F, defined like
this:

$$F(x) = \int_a^x f(t)\,dt.$$

This is exactly the sort of function that we looked at in the previous
section. Eventually we’re going to want the value of F(b), which is
exactly the integral in step 1 above, but first let’s see what we can
understand about F in general.

3. So we have this new function F. It’s like a brand new shiny toy to play 
with. Since we’ve spent so much time differentiating functions, let’s 
try differentiating this one with respect to the variable x. That is, we 
consider 

$$F'(x) = \frac{d}{dx} \int_a^x f(t)\,dt.$$

Understanding the nature of F'(x) will allow us to find F(x) in general.
Once we’ve done that, we can find F(b), which is exactly the integral 
we want. 

The expression

$$\frac{d}{dx} \int_a^x f(t)\,dt$$

might just about be the weirdest thing we’ve looked at so far in this book.
Let’s see how to unravel it. Pick your favorite number x and find F(x). Then
wobble x a little bit: let’s move it to x + h, where h is a small number. So
now our function value is F(x + h). Here’s a picture of the situation:




[Figure: graph of y = f(t), with the region from t = a to t = x and the region from t = a to t = x + h shaded.]

As you can see, x and x + h are pretty close to each other. The values of
F(x) and F(x + h) are pretty close to each other too: they represent the two
shaded areas above (respectively). Now, to differentiate F, we have to find

$$\lim_{h \to 0} \frac{F(x + h) - F(x)}{h}.$$

The difference F(x + h) − F(x) is just the difference between the two shaded
areas, which is itself just the area of the thin little region (with curved top)
between t = x and t = x + h:

[Figure: the thin region under y = f(t) between t = x and t = x + h, shaded.]

You can see this in symbols by splitting up the integral for F(x + h) at t = x,
like this:

$$F(x + h) = \int_a^{x+h} f(t)\,dt = \int_a^x f(t)\,dt + \int_x^{x+h} f(t)\,dt = F(x) + \int_x^{x+h} f(t)\,dt.$$

Rearranging, we get 


$$F(x + h) - F(x) = \int_x^{x+h} f(t)\,dt,$$

which is exactly the shaded area (in square units) of the thin strip above.
Actually, it’s not a strip, since the top is curved, but it’s almost a strip when
h is small. The height of the strip at the left-hand side is f(x) units, so we can
approximate the thin region by a rectangle with base going from x to x + h
and height from 0 to f(x), like this:

[Figure: the thin region approximated by a rectangle with base from x to x + h and height f(x).]



The base of the rectangle is h units, and the height is f(x) units, so the area 
is hf(x) square units. If h is small, then this is a good approximation to the 
integral we want. That is, 

$$F(x + h) - F(x) = \int_x^{x+h} f(t)\,dt \approx h f(x).$$

Dividing by h, we have

$$\frac{F(x + h) - F(x)}{h} \approx f(x).$$

The approximation gets really good when h is really close to 0. It should be
true, then, that the approximation is perfect in the limit as h → 0:

$$\lim_{h \to 0} \frac{F(x + h) - F(x)}{h} = f(x).$$




As we’ll see in Section 17.8 below, the above formula is indeed true; we
conclude that

$$F'(x) = f(x).$$

Let’s summarize our conclusion as follows: 


First Fundamental Theorem of Calculus: for
f continuous on [a, b], define a function F by

$$F(x) = \int_a^x f(t)\,dt \quad\text{for } x \text{ in } [a, b].$$

Then F is differentiable on (a, b) and F'(x) = f(x).

In short, you can write the whole thing as

$$\frac{d}{dx} \int_a^x f(t)\,dt = f(x).$$

So our weird expression simplifies down to f(x)!

A common concern with this last formula is that a appears on the left-hand
side but not on the right-hand side. This actually makes sense, believe
it or not. Suppose that A is some other number in (a, b), and set

$$H(x) = \int_A^x f(t)\,dt.$$

Then, as we saw in Section 17.1 above, F and H differ by a constant:

$$F(x) = H(x) + C$$

for some constant C. If we differentiate, the constant goes away and we see
that F'(x) = H'(x) for all x in (a, b). So the actual choice of a doesn’t affect
the derivative. In terms of the curtain, we only care how fast it’s being pulled 
and how high the rail is at the right-hand point. Where it happens to be 
attached at the fixed left-hand end doesn’t affect the rate of area being swept 
out all the way over at the right-hand part of the curtain. 
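Here is a small numerical experiment in Python that mirrors the argument above. Treat it as a sketch under stated assumptions: the integrand f(t) = t² and the point x = 2 are picked just for illustration, and `integrate` and `difference_quotient` are hypothetical helper names, not anything defined in the text.

```python
def integrate(f, a, b, n=1000):
    """Composite Simpson's rule for the signed integral of f from a to b (n even)."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3


def f(t):
    return t * t


def F(x, a=0.0):
    """The area function: integral of f from the fixed number a to x."""
    return integrate(f, a, x)


def difference_quotient(x, h, a=0.0):
    """(F(x + h) - F(x)) / h, the quantity whose limit defines F'(x)."""
    return (F(x + h, a) - F(x, a)) / h


x = 2.0
for h in (0.1, 0.01, 0.001):
    # Two different left-hand limits: the quotients agree and approach f(x).
    print(h, difference_quotient(x, h, a=0.0), difference_quotient(x, h, a=-1.0))
print("f(x) =", f(x))
```

As h shrinks, both columns close in on f(2) = 4, and the choice of a makes no visible difference to the quotient.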

17.2.1 Introduction to antiderivatives


Now, let’s pause for breath. We started with some function f of the variable
t, as well as some number a; then we constructed a new function F of the
variable x. Differentiating F gives us back the original function f, except now
we evaluate it at x instead of t. Weird! 


OK, weird, but really useful. It actually solves our whole darn problem. 
Let’s see how. Suppose that $f(t) = t^2$ and a = 0, so that

$$F(x) = \int_0^x t^2\,dt.$$

The First Fundamental Theorem tells us that F'(x) = f(x). Since $f(t) = t^2$,
we have $f(x) = x^2$; this means that $F'(x) = x^2$. In other words, F is a
function whose derivative is $x^2$. We say that F is an antiderivative of $x^2$
(with respect to x). Can you think of any other function whose derivative is
$x^2$? Here are a few:

$$G(x) = \frac{x^3}{3}, \qquad H(x) = \frac{x^3}{3} + 7, \qquad\text{and}\qquad J(x) = \frac{x^3}{3} - 2\pi.$$

In each case, you can check that the derivative is $x^2$. In fact, any function of
x of the form

$$\frac{x^3}{3} + C \quad\text{for some constant } C$$


is an antiderivative of $x^2$. Are there any others? The answer is no! We actually
saw this in Section 11.3.1 of Chapter 11. If two functions have the same
derivative, they differ by a constant. This means that all the antiderivatives
of $x^2$ differ by a constant. Since one of the antiderivatives is $x^3/3$, then any
other antiderivative must be $x^3/3 + C$, where C is constant. Wait a second:
the weird function F above is also an antiderivative of $x^2$. This means that

$$F(x) = \int_0^x t^2\,dt = \frac{x^3}{3} + C$$

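for some constant C. Now all we have to do is find C. We know that

$$F(0) = \int_0^0 t^2\,dt = 0.$$

So we have

$$0 = \frac{0^3}{3} + C.$$

This means that C = 0. We now have the formula we’ve been looking for:

$$\int_0^x t^2\,dt = \frac{x^3}{3}.$$

Finally, we can integrate $t^2$ from 0 to any number! In particular, if we replace
x by 1 and then by 2, we get our well-worn formulas

$$\int_0^1 t^2\,dt = \frac{1^3}{3} = \frac{1}{3} \qquad\text{and}\qquad \int_0^2 t^2\,dt = \frac{2^3}{3} = \frac{8}{3}.$$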


This can be made even simpler, as we’ll see in the next section. First,
I’d like to make one more point. We now have a way of constructing an
antiderivative of any continuous function. For example, what is an antiderivative
of $e^{-x}$? Just change x to t, pick your favorite number as a left-hand limit of
integration (let’s say 0 for the moment), and integrate to see that

$$F(x) = \int_0^x e^{-t}\,dt \quad\text{is an antiderivative of } e^{-x}.$$


The number 0 could be replaced by any number you choose, and the same 
statement would be true. Of course, you get a different antiderivative for each 
potential choice of left-hand limit of integration. 


17.3 The Second Fundamental Theorem 


The example with $f(t) = t^2$ in the previous section points the way to finding
$\int_a^b f(t)\,dt$ in general. First, we know that the function F defined by

$$F(x) = \int_a^x f(t)\,dt$$

is an antiderivative of f (with respect to x). We really want to find F(b),
since

$$F(b) = \int_a^b f(t)\,dt.$$
We know one more thing:

$$F(a) = \int_a^a f(t)\,dt = 0,$$

because the left-hand and right-hand limits of integration are equal. 

Now, suppose we have some other antiderivative of f: let’s call it G. Then
F and G differ by a constant, so that G(x) = F(x) + C. Put x = a and you
see that G(a) = F(a) + C; since F(a) = 0 from above, we have G(a) = C.
This means that

$$F(x) = G(x) - C = G(x) - G(a).$$

If you replace x by b, you get

$$F(b) = G(b) - G(a).$$

In other words,

$$\int_a^b f(t)\,dt = G(b) - G(a).$$

This is true for any antiderivative G. Notice that we’ve gotten rid of x 
altogether. So the convention now is to change the dummy variable back to 
x and also change the letter G to F, arriving at the 

Second Fundamental Theorem of Calculus: if f is continuous
on [a, b], and F is any antiderivative of f (with respect to x), then

$$\int_a^b f(x)\,dx = F(b) - F(a).$$

In practice, the right-hand side is normally written as $F(x)\big|_a^b$. That is, we set

$$F(x)\Big|_a^b = F(b) - F(a).$$

So, for example, to evaluate

$$\int_1^2 x^2\,dx,$$

start by finding an antiderivative of $x^2$. We have seen that $x^3/3$ is one
antiderivative, so

$$\int_1^2 x^2\,dx = \frac{x^3}{3}\bigg|_1^2.$$

Now just plug x = 2 and x = 1 into $x^3/3$, and take the difference:

$$\frac{x^3}{3}\bigg|_1^2 = \frac{2^3}{3} - \frac{1^3}{3},$$

which works out to be 7/3. Now, here’s another example. Suppose you want
to find

$$\int_{\pi/6}^{\pi/2} \cos(x)\,dx.$$

We need an antiderivative of cos(x). Luckily, we have one at hand: it’s sin(x).
After all, the derivative with respect to x of sin(x) is cos(x). So, we get

$$\int_{\pi/6}^{\pi/2} \cos(x)\,dx = \sin(x)\Big|_{\pi/6}^{\pi/2} = \sin\left(\frac{\pi}{2}\right) - \sin\left(\frac{\pi}{6}\right) = 1 - \frac{1}{2} = \frac{1}{2}.$$


We’ll look at more examples of this sort in Section 17.6 below. 
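To compare the Second Fundamental Theorem with a brute-force numerical integral, you could run something like the following Python sketch. The helpers `integrate` (Simpson’s rule) and `second_ftc` are names invented for this illustration, and the numerical values are approximations rather than exact answers.

```python
import math


def integrate(f, a, b, n=1000):
    """Composite Simpson's rule for the signed integral of f from a to b (n even)."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3


def second_ftc(F, a, b):
    """Evaluate F(b) - F(a), where F is any antiderivative of the integrand."""
    return F(b) - F(a)


# Integral of x^2 from 1 to 2, using the antiderivative x^3/3.
exact_1 = second_ftc(lambda x: x**3 / 3, 1.0, 2.0)
approx_1 = integrate(lambda x: x**2, 1.0, 2.0)

# Integral of cos(x) from pi/6 to pi/2, using the antiderivative sin(x).
exact_2 = second_ftc(math.sin, math.pi / 6, math.pi / 2)
approx_2 = integrate(math.cos, math.pi / 6, math.pi / 2)

print(exact_1, approx_1)
print(exact_2, approx_2)
```

The antiderivative route takes two function evaluations; the numerical route takes a thousand and still only approximates the answer.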


17.4 Indefinite Integrals


So far, we’ve used two different techniques to find definite integrals: limits of 
Riemann sums (what a pain) and antiderivatives (not so bad). It’s quite clear 
that we’re going to have to become pretty adept at finding antiderivatives — 
in fact, that’s going to occupy us for the next couple of chapters after this 
one. So, we might as well have a shorthand way of expressing antiderivatives 
without having to write the long word “antiderivative.” Inspired by the First 
Fundamental Theorem, we’ll write 

$$\int f(x)\,dx$$

to mean “the family of all antiderivatives of f.” Bear in mind that any
integrable function has infinitely many antiderivatives, but they all differ by
a constant. This is what I mean when I say “family.” For example,

$$\int x^2\,dx = \frac{x^3}{3} + C$$


for some constant C. This equation literally means that the antiderivatives 
of $x^2$ (with respect to x) are precisely the functions $x^3/3 + C$, where C is any
constant. It is an error to omit the “+C” at the end, since that would only 
give one of the antiderivatives and we need them all. 

If you know a derivative, you get an antiderivative for free. In particular: 


$$\text{if } \frac{d}{dx} F(x) = f(x), \text{ then } \int f(x)\,dx = F(x) + C.$$


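The above example fits this pattern:

$$\frac{d}{dx}\left(\frac{x^3}{3}\right) = x^2, \quad\text{so}\quad \int x^2\,dx = \frac{x^3}{3} + C.$$

Similarly, we have

$$\frac{d}{dx}(\sin(x)) = \cos(x), \quad\text{so}\quad \int \cos(x)\,dx = \sin(x) + C.$$

One more example for now (there will be many more later!):

$$\frac{d}{dx}(\tan^{-1}(x)) = \frac{1}{1 + x^2}, \quad\text{so}\quad \int \frac{1}{1 + x^2}\,dx = \tan^{-1}(x) + C.$$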

Again, the number C is an arbitrary constant. It’s just the nature of things 
that differentiable functions have only one derivative whereas integrable
functions have infinitely many antiderivatives.

All the above integrals are examples of indefinite integrals. You can tell
an indefinite integral from a definite integral by noticing whether or not there 
are limits of integration. Indefinite integrals don’t have limits of integration, 
while definite integrals do. This might seem like a small difference, but these 
two objects are very different beasts: 

• A definite integral, like $\int_a^b f(x)\,dx$, is a number. It represents the signed
area of the region bounded by the curve y = f(x), the x-axis, and the
lines x = a and x = b.

• An indefinite integral, like $\int f(x)\,dx$, is a family of functions. This
family consists of all functions which are antiderivatives of f (with
respect to x). The functions all differ by a constant.

For example,

$$\int_1^2 x^2\,dx = \frac{7}{3}, \quad\text{while}\quad \int x^2\,dx = \frac{x^3}{3} + C.$$


If it weren’t for the Second Fundamental Theorem, it would be crazy to use the 
same symbol $\int$ for both of these objects. Luckily, the indefinite integral (or
antiderivative) is exactly what you need in order to find the definite integral, 
so it makes a lot of sense to use the symbol in both cases. 

Here are two simple facts about indefinite integrals that follow directly 
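from the similar properties for derivatives: if f and g are integrable, and c is
a constant, then

$$\int (f(x) + g(x))\,dx = \int f(x)\,dx + \int g(x)\,dx \qquad\text{and}\qquad \int c f(x)\,dx = c \int f(x)\,dx.$$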


That is, the integral of the sum is the sum of the integrals, and constant 
multiples can be pulled through the integral sign. So, in particular, 


$$\int (5x^2 + 9\cos(x))\,dx = 5 \int x^2\,dx + 9 \int \cos(x)\,dx = \frac{5x^3}{3} + 9\sin(x) + C.$$

Notice that we only need one constant: even though $5x^3/3$ and 9 sin(x) could
each get their own constant added to them, you can just combine the two
constants into one by adding them up. By the way, what works for sums also
works for differences:


$$\int (5x^2 - 9\cos(x))\,dx = 5 \int x^2\,dx - 9 \int \cos(x)\,dx = \frac{5x^3}{3} - 9\sin(x) + C.$$


Again, only one constant is needed. 
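A handy habit is to check an indefinite integral by differentiating the answer. The Python sketch below does this numerically for the sum example; the helper `derivative` (a central-difference estimate) and the sample points are assumptions made for illustration only.

```python
import math


def derivative(g, x, h=1e-5):
    """Central-difference estimate of g'(x)."""
    return (g(x + h) - g(x - h)) / (2 * h)


def antiderivative(x, C=0.0):
    """The claimed answer 5x^3/3 + 9 sin(x) + C."""
    return 5 * x**3 / 3 + 9 * math.sin(x) + C


def integrand(x):
    return 5 * x**2 + 9 * math.cos(x)


for x in (-2.0, 0.3, 1.7):
    for C in (0.0, 4.0):
        # Whatever constant C we add, the derivative matches the integrand.
        print(x, C, derivative(lambda s: antiderivative(s, C), x), integrand(x))
```

The constant C has no effect on the derivative column, which is the numerical face of “they all differ by a constant.”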
Before we look at some more examples, I want to make one more comment
about the two Fundamental Theorems. The First Fundamental Theorem says
that

$$\frac{d}{dx} \int_a^x f(t)\,dt = f(x).$$

In some sense, the derivative of the integral is the original function. You just
have to be careful about what you mean by the “integral,” bearing in mind
that the variable has to be the right-hand limit of integration, not the dummy
variable. Now, the Second Fundamental Theorem says that

$$\int_a^b f(x)\,dx = F(x)\Big|_a^b,$$

where F is an antiderivative of f. This means that $f(x) = \frac{d}{dx} F(x)$. We can
therefore rewrite the above equation as

$$\int_a^b \frac{d}{dx} F(x)\,dx = F(x)\Big|_a^b,$$

which can be interpreted as saying that the integral of the derivative is the 
original function. Again, it’s not really the original function: it’s the difference 
between the evaluations of the original function at the endpoints a and b. 
Even with all this vagueness, it should still be clear that differentiation and 
integration are essentially opposite operations. 

Now, let’s see how to use the Fundamental Theorems to solve problems. 


17.5 How to Solve Problems: The First Fundamental Theorem

Think about how you’d find the following derivative:

$$\frac{d}{dx} \int_3^x \sin(t^2)\,dt.$$

You could try to find the indefinite integral $\int \sin(t^2)\,dt$, then plug in x and 3
and take the difference; this will give

$$\int_3^x \sin(t^2)\,dt,$$

which you could finally differentiate. Why go to all that work when the
derivative and integral effectively cancel each other out? After all, if you
wanted to find $(\sqrt{54756})^2$, you wouldn’t waste time looking for $\sqrt{54756}$ when
you just have to square it again. You’d just write down the answer 54756 and
be done with it. Similarly, we can use the First Fundamental Theorem from
above to say that

$$\frac{d}{dx} \int_3^x \sin(t^2)\,dt = \sin(x^2).$$

All you have to do is take the integrand $\sin(t^2)$ and change t to x. The number
3 doesn’t even come into it (see Section 17.1 above for a discussion of this).

By the way, it would be a mistake to put a “+C” at the end: you are finding 
a derivative, after all, not an antiderivative! 

Of course, you have to be versatile: the letters can change around. For
example, what is

$$\frac{d}{dz} \int_{-e}^z 2^{\cos(w^2 \ln(w+5))}\,dw?$$

Just replace w by z in the integrand and see that

$$\frac{d}{dz} \int_{-e}^z 2^{\cos(w^2 \ln(w+5))}\,dw = 2^{\cos(z^2 \ln(z+5))}.$$

Note that −e is a constant, but once again this could have been replaced
by any other constant and the answer would be the same. (By the way, the
integral only makes sense if z > −5.)

That’s really all there is to the basic version, where the variable (that 
you’re differentiating with respect to) is just sitting there on the right-hand 
limit of integration. All you have to do is replace the dummy variable in the 
integrand with the real variable. There are four variations that can arise, 
however: let’s look at them one at a time. 


17.5.1 Variation 1: variable left-hand limit of integration

Consider

$$\frac{d}{dx} \int_x^7 t^3 \cos(t \ln(t))\,dt.$$

The problem is that the variable x is now the left-hand limit of integration,
not the right-hand one we’ve been used to. No problem: just switch the x and
7 around, introducing a minus sign to compensate for this (see Section 16.3
in the previous chapter to remind yourself why this works). You get

$$\frac{d}{dx} \int_x^7 t^3 \cos(t \ln(t))\,dt = \frac{d}{dx} \left( -\int_7^x t^3 \cos(t \ln(t))\,dt \right).$$

Now pull out the minus sign and use the First Fundamental Theorem to see
that this is equal to

$$-x^3 \cos(x \ln(x)),$$

if x > 0. In effect, all we are doing is taking the integrand, replacing the
dummy variable t by x, and putting a minus sign out front. It’s important to 
justify the minus sign and the use of the First Fundamental Theorem by first 
switching the limits of integration, as we did in the above example. 
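The minus sign is easy to lose, so here is a numerical spot check in Python. It is a sketch, not a proof: `integrate` (Simpson’s rule), `derivative` (central difference), the name `A` for the area function and the test point x = 2 are all choices made for this example.

```python
import math


def integrate(f, a, b, n=20000):
    """Composite Simpson's rule for the signed integral of f from a to b (n even)."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3


def derivative(g, x, h=1e-4):
    """Central-difference estimate of g'(x)."""
    return (g(x + h) - g(x - h)) / (2 * h)


def integrand(t):
    return t**3 * math.cos(t * math.log(t))


def A(x):
    """Integral from x to 7: the variable sits in the LEFT-hand limit."""
    return integrate(integrand, x, 7.0)


x = 2.0
numeric = derivative(A, x)
formula = -integrand(x)  # -x^3 cos(x ln(x)), minus sign included
print(numeric, formula)
```

If you drop the minus sign in `formula`, the two printed numbers come out with opposite signs.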


17.5.2 Variation 2: one tricky limit of integration

Here’s another example: 

$$\frac{d}{dx} \int_0^{x^2} \tan^{-1}(t^7 + 3t)\,dt.$$

Because the right-hand limit of integration is $x^2$, not x, we can’t just use the
First Fundamental Theorem directly. We’re going to need the chain rule as
and remind yourself that you’re looking for dy/dq. Now set u = sin(q), so

$$y = \int_4^u \tan(\cos(a))\,da.$$
By the First Fundamental Theorem,

$$\frac{dy}{du} = \frac{d}{du} \int_4^u \tan(\cos(a))\,da = \tan(\cos(u)).$$

Since u = sin(q), we have du/dq = cos(q), so the chain rule equation above
becomes

$$\frac{dy}{dq} = \frac{dy}{du} \frac{du}{dq} = \tan(\cos(u)) \cos(q).$$

Finally, replace u by sin(q) to see that

$$\frac{d}{dq} \int_4^{\sin(q)} \tan(\cos(a))\,da = \tan(\cos(\sin(q))) \cos(q).$$

You might also encounter both of the above variations in the same problem.
For example, to find

$$\frac{d}{dq} \int_{\sin(q)}^4 \tan(\cos(a))\,da,$$

start by switching the limits of integration, introducing a minus sign as you
do so:

$$\frac{d}{dq} \int_{\sin(q)}^4 \tan(\cos(a))\,da = -\frac{d}{dq} \int_4^{\sin(q)} \tan(\cos(a))\,da.$$

Now you can find the right-hand side as we did above; the final answer will
be the same, except for that minus sign out front:

$$\frac{d}{dq} \int_{\sin(q)}^4 \tan(\cos(a))\,da = -\frac{d}{dq} \int_4^{\sin(q)} \tan(\cos(a))\,da = -\tan(\cos(\sin(q))) \cos(q).$$
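The chain-rule factor cos(q) can be confirmed numerically. The following Python sketch is illustrative only: `integrate`, `derivative`, `y`, `y_reversed` and the test value q = 1 are assumptions introduced here, and the derivatives are finite-difference estimates.

```python
import math


def integrate(f, a, b, n=20000):
    """Composite Simpson's rule for the signed integral of f from a to b (n even)."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3


def derivative(g, x, h=1e-4):
    """Central-difference estimate of g'(x)."""
    return (g(x + h) - g(x - h)) / (2 * h)


def integrand(a):
    return math.tan(math.cos(a))


def y(q):
    """Integral from 4 to sin(q): tricky right-hand limit."""
    return integrate(integrand, 4.0, math.sin(q))


def y_reversed(q):
    """Integral from sin(q) to 4: tricky limit on the left-hand side."""
    return integrate(integrand, math.sin(q), 4.0)


q = 1.0
formula = math.tan(math.cos(math.sin(q))) * math.cos(q)
print(derivative(y, q), formula)
print(derivative(y_reversed, q), -formula)
```

Leaving out the factor `math.cos(q)` in `formula` makes the comparison fail, which is a quick way to convince yourself that the chain rule really is needed.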


17.5.3 Variation 3: two tricky limits of integration

Here’s an even harder example:

$$\frac{d}{dx} \int_{x^5}^{x^6} \ln(t^2 - \sin(t) + 7)\,dt.$$


Now there are functions of x in both the left-hand and right-hand limits of 
integration. The way to handle this is to split the integral into two pieces at 
some number. It actually doesn’t matter where you split it, as long as it is at 
a constant (where the function is defined). So, pick your favorite number, say
0, and split the integral there:

$$\frac{d}{dx} \int_{x^5}^{x^6} \ln(t^2 - \sin(t) + 7)\,dt = \frac{d}{dx} \left( \int_{x^5}^0 \ln(t^2 - \sin(t) + 7)\,dt + \int_0^{x^6} \ln(t^2 - \sin(t) + 7)\,dt \right).$$
We’ve reduced the problem to two easier derivatives. The first one is a
combination of the first two variations above. Just switch the limits of integration,
introducing the minus sign, to write

$$\frac{d}{dx} \int_{x^5}^0 \ln(t^2 - \sin(t) + 7)\,dt = -\frac{d}{dx} \int_0^{x^5} \ln(t^2 - \sin(t) + 7)\,dt.$$

Now use the chain rule by setting $u = x^5$ and following the method from the
previous section. You should check that the above derivative works out to be

$$-5x^4 \ln((x^5)^2 - \sin(x^5) + 7) = -5x^4 \ln(x^{10} - \sin(x^5) + 7).$$

As for the other derivative above,

$$\frac{d}{dx} \int_0^{x^6} \ln(t^2 - \sin(t) + 7)\,dt,$$


there’s no need to switch the limits of integration: just set $v = x^6$ and apply
the chain rule once again. You should check that the above derivative is equal
to

$$6x^5 \ln((x^6)^2 - \sin(x^6) + 7) = 6x^5 \ln(x^{12} - \sin(x^6) + 7).$$

Putting it all together, we have shown that

$$\frac{d}{dx} \int_{x^5}^{x^6} \ln(t^2 - \sin(t) + 7)\,dt = -5x^4 \ln(x^{10} - \sin(x^5) + 7) + 6x^5 \ln(x^{12} - \sin(x^6) + 7).$$
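With two moving limits there are two places to slip up, so a numerical comparison is reassuring. This Python sketch is a hedged illustration: the helper names `integrate`, `derivative`, `K` and `formula`, and the sample points, are made up for the example.

```python
import math


def integrate(f, a, b, n=20000):
    """Composite Simpson's rule for the signed integral of f from a to b (n even)."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3


def derivative(g, x, h=1e-4):
    """Central-difference estimate of g'(x)."""
    return (g(x + h) - g(x - h)) / (2 * h)


def integrand(t):
    # t^2 - sin(t) + 7 is at least 6, so the logarithm is always defined.
    return math.log(t * t - math.sin(t) + 7)


def K(x):
    """Integral from x^5 to x^6: both limits depend on x."""
    return integrate(integrand, x**5, x**6)


def formula(x):
    left = -5 * x**4 * math.log(x**10 - math.sin(x**5) + 7)
    right = 6 * x**5 * math.log(x**12 - math.sin(x**6) + 7)
    return left + right


for x in (-0.8, 0.5, 1.1):
    print(x, derivative(K, x), formula(x))
```

Notice that the code never splits the integral at 0; the split is only a device for the pencil-and-paper calculation, and the answer agrees regardless.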

17.5.4 Variation 4: limit is a derivative in disguise 

Here’s an example which looks a little different: 


$$\lim_{h \to 0} \frac{1}{h} \int_x^{x+h} \log_3(\cos^6(t) + 2)\,dt.$$


This isn’t a derivative; it’s a limit. Actually, it is a derivative in disguise (see
Section 6.5 in Chapter 6 for a discussion of these types of limits). The trick
is to set

$$F(x) = \int_a^x \log_3(\cos^6(t) + 2)\,dt$$

for some constant a. You can put in a specific constant if you like, or you can
just leave it as a. It doesn’t matter, because in any case we have

$$F(x + h) - F(x) = \int_x^{x+h} \log_3(\cos^6(t) + 2)\,dt.$$

Check that you believe this, or look back at Section 17.2 above. In any case,
in terms of our function F, we have

$$\lim_{h \to 0} \frac{1}{h} \int_x^{x+h} \log_3(\cos^6(t) + 2)\,dt = \lim_{h \to 0} \frac{F(x + h) - F(x)}{h} = F'(x).$$


So actually, we have 

$$\lim_{h \to 0} \frac{1}{h} \int_x^{x+h} \log_3(\cos^6(t) + 2)\,dt = \frac{d}{dx} \int_a^x \log_3(\cos^6(t) + 2)\,dt$$

for any a you like. See, I told you that the limit was a derivative in disguise!
To finish the problem, just apply the First Fundamental Theorem in its basic
form to see that the above limit is just $\log_3(\cos^6(x) + 2)$.
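You can watch this limit settle down numerically. In the Python sketch below, `integrate`, `g` and `average_over_strip` are hypothetical names chosen for the illustration, and x = 0.7 is an arbitrary sample point.

```python
import math


def integrate(f, a, b, n=1000):
    """Composite Simpson's rule for the signed integral of f from a to b (n even)."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3


def g(t):
    """The integrand log_3(cos^6(t) + 2)."""
    return math.log(math.cos(t) ** 6 + 2, 3)


def average_over_strip(x, h):
    """(1/h) times the integral of g from x to x + h."""
    return integrate(g, x, x + h) / h


x = 0.7
for h in (0.1, 0.01, 0.001):
    print(h, average_over_strip(x, h))
print("log_3(cos^6(x) + 2) =", g(x))
```

The quantity being computed is the average height of the integrand over a strip of width h, so it makes sense that it approaches the height at x itself as the strip shrinks.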

17.6 How to Solve Problems: The Second Fundamental Theorem

To find a definite integral using the Second Fundamental Theorem (and this
is how you want to find definite integrals, believe me), you need to find the
indefinite integral first, then substitute in the endpoints and take the
difference. So let’s spend a little time discussing how to find indefinite integrals
(that is, antiderivatives), then look at some examples of how to find definite 
integrals. This is only the beginning of the story; in the next two chapters, 
we’ll look at many more ways of finding indefinite integrals. 

17.6.1 Finding indefinite integrals

As we saw in Section 17.4 above, whenever you know a derivative, you get 
an antiderivative for free. We gave some examples there, but here’s another: 
since 

$$\frac{d}{dx}(x^4) = 4x^3,$$

we immediately know that

$$\int 4x^3\,dx = x^4 + C.$$

Since constants just pass through the integral sign, we can write this as

$$4 \int x^3\,dx = x^4 + C.$$

Now divide by 4:

$$\int x^3\,dx = \frac{x^4}{4} + \frac{C}{4}.$$

This is fine, but the quantity C/4 is a bit silly. It’s some arbitrary constant
divided by 4, which is another arbitrary constant. So we can just replace the
constant C/4 by some other constant, which we’ll also call C, and get

$$\int x^3\,dx = \frac{x^4}{4} + C.$$

Let’s repeat this for any power of x. Start off by noting that

$$\frac{d}{dx}(x^{a+1}) = (a + 1)x^a;$$

this means that

$$\int (a + 1)x^a\,dx = x^{a+1} + C.$$

If a ≠ −1, then a + 1 ≠ 0; so we can divide through by (a + 1) and write

$$\int x^a\,dx = \frac{x^{a+1}}{a + 1} + C.$$

(Once again, we replaced C/(a + 1) by simply C; this is OK since C is just an
arbitrary constant.) Now, what happens when a = −1? The above method
doesn’t work on $\int x^{-1}\,dx$, which is just

$$\int \frac{1}{x}\,dx.$$

On the other hand, we do know from Section 9.3 of Chapter 9 that

$$\frac{d}{dx}(\ln(x)) = \frac{1}{x}, \quad\text{so}\quad \int \frac{1}{x}\,dx = \ln(x) + C.$$

This is fine, but actually we can do better. You see, 1/x is defined everywhere
except at x = 0, while ln(x) is only defined when x > 0. We can rectify this
by writing

$$\int \frac{1}{x}\,dx = \ln|x| + C.$$

Let’s check that this works. We need to show that 
d 


dx 


ln|x| : 


for all x ^ 0. When a; > 0, the left-hand side is just ln(o;) and there’s no 
problem. If a: < 0, then \x\ is actually equal to —x, so the left-hand side 
becomes 

It looks a bit weird, but remember that —x is actually positive when a: < 0. 
In any case, by the chain rule, the above derivative is 


So we have proved the formula 


■ dx = ln|a;| + C. 
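If you’d like some reassurance that the absolute values really do the job on both sides of 0, here is a small numerical sanity check in Python. It is only a sketch and not part of the proof: the helper names `derivative` and `log_abs` are made up for this illustration, and the slope is estimated by a central difference, so the agreement is approximate rather than exact.

```python
import math

def derivative(f, x, h=1e-6):
    """Estimate f'(x) with a central difference (an approximation, not exact)."""
    return (f(x + h) - f(x - h)) / (2 * h)

def log_abs(x):
    """The proposed antiderivative ln|x|, defined for every x except 0."""
    return math.log(abs(x))

# Sample points on both sides of 0, where 1/x is defined.
sample_points = [-5.0, -1.0, -0.25, 0.25, 1.0, 5.0]
errors = [abs(derivative(log_abs, x) - 1 / x) for x in sample_points]

for x, err in zip(sample_points, errors):
    print(f"x = {x:6.2f}   |estimated slope - 1/x| = {err:.1e}")
```

The errors should all be tiny, including at the negative sample points where plain ln(x) isn’t even defined.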


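See Section 17.7 below for a technicality involving this formula. In the meantime,
we can now summarize most of the basic derivatives and corresponding
antiderivatives that we’ve seen so far in one big table.

Derivatives and integrals to learn:

$$
\begin{array}{ll}
\dfrac{d}{dx}\,x^a = ax^{a-1} & \displaystyle\int x^a\,dx = \frac{x^{a+1}}{a+1} + C \quad (\text{if } a \neq -1) \\[1.5ex]
\dfrac{d}{dx}\,\ln(x) = \dfrac{1}{x} & \displaystyle\int \frac{1}{x}\,dx = \ln|x| + C \\[1.5ex]
\dfrac{d}{dx}\,e^x = e^x & \displaystyle\int e^x\,dx = e^x + C \\[1.5ex]
\dfrac{d}{dx}\,b^x = b^x\ln(b) & \displaystyle\int b^x\,dx = \frac{b^x}{\ln(b)} + C \\[1.5ex]
\dfrac{d}{dx}\,\sin(x) = \cos(x) & \displaystyle\int \cos(x)\,dx = \sin(x) + C \\[1.5ex]
\dfrac{d}{dx}\,\cos(x) = -\sin(x) & \displaystyle\int \sin(x)\,dx = -\cos(x) + C \\[1.5ex]
\dfrac{d}{dx}\,\tan(x) = \sec^2(x) & \displaystyle\int \sec^2(x)\,dx = \tan(x) + C \\[1.5ex]
\dfrac{d}{dx}\,\sec(x) = \sec(x)\tan(x) & \displaystyle\int \sec(x)\tan(x)\,dx = \sec(x) + C \\[1.5ex]
\dfrac{d}{dx}\,\cot(x) = -\csc^2(x) & \displaystyle\int \csc^2(x)\,dx = -\cot(x) + C \\[1.5ex]
\dfrac{d}{dx}\,\csc(x) = -\csc(x)\cot(x) & \displaystyle\int \csc(x)\cot(x)\,dx = -\csc(x) + C \\[1.5ex]
\dfrac{d}{dx}\,\sin^{-1}(x) = \dfrac{1}{\sqrt{1-x^2}} & \displaystyle\int \frac{1}{\sqrt{1-x^2}}\,dx = \sin^{-1}(x) + C \\[1.5ex]
\dfrac{d}{dx}\,\tan^{-1}(x) = \dfrac{1}{1+x^2} & \displaystyle\int \frac{1}{1+x^2}\,dx = \tan^{-1}(x) + C \\[1.5ex]
\dfrac{d}{dx}\,\sec^{-1}(x) = \dfrac{1}{|x|\sqrt{x^2-1}} & \displaystyle\int \frac{1}{|x|\sqrt{x^2-1}}\,dx = \sec^{-1}(x) + C \\[1.5ex]
\dfrac{d}{dx}\,\sinh(x) = \cosh(x) & \displaystyle\int \cosh(x)\,dx = \sinh(x) + C \\[1.5ex]
\dfrac{d}{dx}\,\cosh(x) = \sinh(x) & \displaystyle\int \sinh(x)\,dx = \cosh(x) + C
\end{array}
$$
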

As we’ve seen, if you replace x by the constant multiple ax in any of the above
differentiation formulas, you just have to multiply the corresponding formula
by a. For example,

$$\frac{d}{dx}\tan(7x) = 7\sec^2(7x).$$

What if you integrate instead? Now the rule of thumb is that if you replace
x by ax, then you have to divide by a. For example,

$$\int \sec^2(7x)\,dx = \frac{1}{7}\tan(7x) + C.$$

You can see this directly from the previous equation by dividing by 7. Here’s
another example:

$$\int e^{-x/3}\,dx.$$

You can think of x as having been replaced by −1/3 times x; so divide by
−1/3, like this:

$$\int e^{-x/3}\,dx = \frac{1}{-1/3}\,e^{-x/3} + C = -3e^{-x/3} + C.$$

How about one more for good measure? Consider

$$\int \frac{1}{1+2x^2}\,dx.$$

This can be written as

$$\int \frac{1}{1+(\sqrt{2}\,x)^2}\,dx,$$

and now you can consider x as being replaced by $\sqrt{2}\,x$. So divide by $\sqrt{2}$ to
get

$$\int \frac{1}{1+(\sqrt{2}\,x)^2}\,dx = \frac{1}{\sqrt{2}}\tan^{-1}(\sqrt{2}\,x) + C.$$

There are many more complicated techniques for finding antiderivatives which
we’ll look at in the next two chapters, but it certainly doesn’t hurt to remember
this simple one, since constant multiples do come up often in integrands.
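The “divide by a” rule of thumb is easy to test by machine: differentiate each proposed antiderivative numerically and compare with the integrand. The following Python sketch does this for the three examples above. The names `derivative` and `pairs` are invented for the illustration, and the central-difference slope is only an estimate, so treat the output as a sanity check rather than a proof.

```python
import math

def derivative(f, x, h=1e-6):
    """Estimate f'(x) with a central difference (an approximation, not exact)."""
    return (f(x + h) - f(x - h)) / (2 * h)

# (proposed antiderivative, integrand) for the three examples in the text
pairs = [
    (lambda x: math.tan(7 * x) / 7,
     lambda x: 1 / math.cos(7 * x) ** 2),           # sec^2(7x)
    (lambda x: -3 * math.exp(-x / 3),
     lambda x: math.exp(-x / 3)),                   # e^(-x/3)
    (lambda x: math.atan(math.sqrt(2) * x) / math.sqrt(2),
     lambda x: 1 / (1 + 2 * x * x)),                # 1/(1 + 2x^2)
]

# Largest disagreement between the estimated slope and the integrand.
worst = max(abs(derivative(F, x) - f(x))
            for F, f in pairs
            for x in (-0.1, 0.0, 0.1))
print(f"largest mismatch: {worst:.1e}")
```

If you forget to divide by the constant in any of the three antiderivatives, the mismatch stops being tiny, which is a handy way to catch that mistake.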


17.6.2 Finding definite integrals 


The Second Fundamental Theorem tells us that to find

$$\int_a^b f(x)\,dx,$$

just find an antiderivative, plug in x = b and x = a, and take the difference.
We’ve already looked at some examples of this in Section 17.3 above; let’s
look at five more. First, consider

$$\int_{-1}^{2} x^4\,dx.$$

By the formula

$$\int x^a\,dx = \frac{x^{a+1}}{a+1} + C,$$

we know that an antiderivative of $x^4$ is $x^5/5$. No need for the constant:
you can choose any antiderivative, so just choose the one with C = 0 for
simplicity. So, we have

$$\int_{-1}^{2} x^4\,dx = \frac{x^5}{5}\bigg|_{-1}^{2} = \left(\frac{2^5}{5}\right) - \left(\frac{(-1)^5}{5}\right) = \left(\frac{32}{5}\right) - \left(-\frac{1}{5}\right) = \frac{33}{5}.$$

It’s important to use parentheses to make sure you don’t screw up the minus
signs! Now, you might be wondering what happens if you did happen to use a
different antiderivative. Well, the constant will just cancel out. For example,
if you chose the antiderivative $x^5/5 - 1001$ instead, you’d get

$$\int_{-1}^{2} x^4\,dx = \left(\frac{x^5}{5} - 1001\right)\bigg|_{-1}^{2} = \left(\frac{2^5}{5} - 1001\right) - \left(\frac{(-1)^5}{5} - 1001\right) = \left(\frac{2^5}{5}\right) - 1001 - \left(\frac{(-1)^5}{5}\right) + 1001.$$

Notice that the −1001 and +1001 terms cancel and we’re left with exactly
what we had before. The moral of the story is to omit the constant C when
calculating a definite integral.

Here’s our second integral:

$$\int_{-e^2}^{-1} \frac{4}{x}\,dx.$$

The factor 4 can just pass through the integral sign, so we need to use the
formula

$$\int \frac{1}{x}\,dx = \ln|x| + C$$

from the above table to see that 4 ln|x| is an antiderivative for 4/x. So we
have

$$\int_{-e^2}^{-1} \frac{4}{x}\,dx = 4\ln|x|\,\bigg|_{-e^2}^{-1} = (4\ln|-1|) - (4\ln|-e^2|) = 4\ln(1) - 4\ln(e^2) = -8.$$

Here we have used the facts that ln(1) = 0 and $\ln(e^2) = 2\ln(e) = 2$.

The third example is

$$\int_0^{\pi/3} \left(\sec^2(x) - 5\sin\left(\frac{x}{2}\right)\right) dx.$$

You should mentally split up the integrand into two components, $\sec^2(x)$ and
sin(x/2), ignoring the constant 5 outside the second integral. By the above
table, an antiderivative of $\sec^2(x)$ is tan(x). As for sin(x/2), an antiderivative
is −cos(x/2) divided by 1/2, since x has been replaced by the constant multiple
$\frac{1}{2}x$. This works out to be −2cos(x/2) (since dividing by 1/2 is the same as
multiplying by 2). Altogether, we have

$$\int_0^{\pi/3} \left(\sec^2(x) - 5\sin\left(\frac{x}{2}\right)\right) dx = \left(\tan(x) - 5\times\left(-2\cos\left(\frac{x}{2}\right)\right)\right)\bigg|_0^{\pi/3}.$$

Simplifying and substituting, we get

$$\left(\tan\left(\frac{\pi}{3}\right) + 10\cos\left(\frac{\pi}{6}\right)\right) - \left(\tan(0) + 10\cos(0)\right);$$

you should check that this works out to be $6\sqrt{3} - 10$.
Here’s the fourth example:

$$\int_4^9 \frac{1}{x\sqrt{x}}\,dx.$$

The trick here is to write the integrand as $x^{-3/2}$; make sure you believe this!
Now we can just use the formula for $\int x^a\,dx$ from our big table in the previous
section to get

$$\int_4^9 \frac{1}{x\sqrt{x}}\,dx = \int_4^9 x^{-3/2}\,dx = \frac{1}{-1/2}\,x^{-1/2}\,\bigg|_4^9 = \left(-2(9)^{-1/2}\right) - \left(-2(4)^{-1/2}\right) = -\frac{2}{3} + \frac{2}{2} = \frac{1}{3}.$$

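Now, our final example for this section is

$$\int_0^{1/6} \frac{dx}{\sqrt{1-9x^2}}.$$

Don’t let the dx on the top worry you; this is just an alternate way of writing

$$\int_0^{1/6} \frac{1}{\sqrt{1-9x^2}}\,dx.$$

Express the $9x^2$ term as $(3x)^2$ to see that

$$\int_0^{1/6} \frac{1}{\sqrt{1-9x^2}}\,dx = \int_0^{1/6} \frac{1}{\sqrt{1-(3x)^2}}\,dx = \frac{1}{3}\sin^{-1}(3x)\,\bigg|_0^{1/6}.$$

We have used the integral

$$\int \frac{1}{\sqrt{1-x^2}}\,dx = \sin^{-1}(x) + C$$

from the above table, except that we have divided by 3, since x is replaced
by 3x. Now let’s substitute to see that our integral becomes

$$\left(\frac{1}{3}\sin^{-1}\left(3\times\frac{1}{6}\right)\right) - \left(\frac{1}{3}\sin^{-1}(3\times 0)\right) = \left(\frac{1}{3}\times\frac{\pi}{6}\right) - (0) = \frac{\pi}{18}.$$

Here we’ve used the fact that $\sin^{-1}\left(\frac{1}{2}\right) = \pi/6$.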
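All five answers above came from antiderivatives, so a good independent check is to approximate each definite integral directly and compare. Here is a minimal sketch in Python that does so with the composite Simpson’s rule. This is a standard numerical method that the chapter doesn’t rely on; the function `simpson` and the list `examples` are illustrative names only.

```python
import math

def simpson(f, a, b, n=2000):
    """Approximate the integral of f over [a, b] by composite Simpson's rule (n even)."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

# (integrand, lower limit, upper limit, value found with an antiderivative)
examples = [
    (lambda x: x ** 4, -1, 2, 33 / 5),
    (lambda x: 4 / x, -math.e ** 2, -1, -8),
    (lambda x: 1 / math.cos(x) ** 2 - 5 * math.sin(x / 2),
     0, math.pi / 3, 6 * math.sqrt(3) - 10),
    (lambda x: 1 / (x * math.sqrt(x)), 4, 9, 1 / 3),
    (lambda x: 1 / math.sqrt(1 - 9 * x * x), 0, 1 / 6, math.pi / 18),
]

errors = [abs(simpson(f, a, b) - exact) for f, a, b, exact in examples]
for k, err in enumerate(errors, start=1):
    print(f"example {k}: |numerical - exact| = {err:.1e}")
```

A sign slip in any of the five hand calculations would show up here as an error that is nowhere near zero.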

17.6.3 Unsigned areas and absolute values 

In Section 16.1.1 of the previous chapter, we saw that

$$\int_{-\pi}^{\pi} \sin(x)\,dx = 0$$

because the area above the axis cancels the area below the axis. Here’s a
recap of the graph of the situation:

[Figure: graph of y = sin(x) between x = −π and x = π]

We can check the above integral using antiderivatives:

$$\int_{-\pi}^{\pi} \sin(x)\,dx = -\cos(x)\,\bigg|_{-\pi}^{\pi} = (-\cos(\pi)) - (-\cos(-\pi)) = -(-1) + (-1) = 0.$$

How about finding the unsigned, actual area in the above picture? We looked
at a method for doing this in Section 16.4.1 of the previous chapter: the actual
area in square units is equal to

$$\int_{-\pi}^{\pi} |\sin(x)|\,dx.$$

Our method calls for splitting the original integral

$$\int_{-\pi}^{\pi} \sin(x)\,dx$$

at the x-intercept 0, then taking the absolute value of each piece. That is,

$$\int_{-\pi}^{\pi} |\sin(x)|\,dx = \left|\int_{-\pi}^{0} \sin(x)\,dx\right| + \left|\int_{0}^{\pi} \sin(x)\,dx\right|.$$

I leave it to you to use the antiderivative — cos(x) to show that these two 
integrals are —2 and 2, respectively. If you just add these numbers, you get 
the signed area 0 square units; but if you take the absolute values first, you 
get the actual area, which is |—2| + |2| =4 square units. 
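Here is the same split-and-take-absolute-values recipe as a short Python sketch, in case you want to see the signed and unsigned areas side by side. The two pieces are approximated with Simpson’s rule (a numerical method, not the antiderivative approach used in the text), and the names `simpson`, `left` and `right` are just illustrative.

```python
import math

def simpson(f, a, b, n=1000):
    """Approximate the integral of f over [a, b] by composite Simpson's rule (n even)."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

# Split at the x-intercept 0, exactly as in the text.
left = simpson(math.sin, -math.pi, 0)    # piece below the axis
right = simpson(math.sin, 0, math.pi)    # piece above the axis

signed_area = left + right               # the two pieces cancel
actual_area = abs(left) + abs(right)     # absolute values first, then add

print(f"signed area = {signed_area:.6f}, actual area = {actual_area:.6f}")
```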

Now, let’s look at an example of finding the area between two curves. We
already saw how to do this in Section 16.4.2 of the previous chapter, but now
we have the power of the Second Fundamental Theorem at our disposal, so
we can find more exotic areas like this one:

[Figure: the region bounded by the curves y = x and y = 1/x and the line x = 2]

We’re looking for the area between the curves y = x, y = 1/x, and the line
x = 2. We’ll need to find where y = x and y = 1/x intersect: set x = 1/x and
we see that $x^2 = 1$. This means that x = 1 or x = −1. In the above picture,
the x-coordinate of the intersection point is positive, so we need x = 1. Since
y = x is above y = 1/x, we take the top function x minus the bottom function
1/x and integrate:

$$\int_1^2 \left(x - \frac{1}{x}\right) dx.$$

An antiderivative of x is $x^2/2$, as we can easily see by using the formula
$\int x^a\,dx = x^{a+1}/(a+1) + C$ with a = 1; also, an antiderivative of 1/x is ln|x|, as
we saw above. So the above integral is equal to

$$\left(\frac{x^2}{2} - \ln|x|\right)\bigg|_1^2 = \left(\frac{2^2}{2} - \ln|2|\right) - \left(\frac{1^2}{2} - \ln|1|\right).$$

This simplifies to 3/2 − ln(2), so the area we want is 3/2 − ln(2) square units.
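Now, consider what happens if the area we actually want to find is this instead:

[Figure: the region between the curves y = x and y = 1/x from x = 1/2 to x = 2]

It’s tempting to write this area as

$$\int_{1/2}^{2} \left(x - \frac{1}{x}\right) dx,$$

but that would be a load of bull. You see, the curve y = x isn’t on top of
y = 1/x between 1/2 and 1. We discussed this point in Section 16.4.2 of the
previous chapter, and saw that we actually need to take absolute values:

$$\text{new shaded area} = \int_{1/2}^{2} \left|x - \frac{1}{x}\right| dx.$$

Since the only intersection point is at x = 1, split the integral up into two
pieces there and take the absolute value of each piece to get

$$\left|\int_{1/2}^{1} \left(x - \frac{1}{x}\right) dx\right| + \left|\int_{1}^{2} \left(x - \frac{1}{x}\right) dx\right|.$$

The first integral works out to be 3/8 − ln(2), which is negative, and the second
is 3/2 − ln(2) as before. So the area is (ln(2) − 3/8) + (3/2 − ln(2)) = 9/8 square
units. As a check, if you reflect the left-hand piece of the region in the line
y = x, it fits together with the right-hand piece to form the triangle with
corners at (1/2, 1/2), (2, 1/2) and (2, 2).

This triangle has base and height equal to 3/2 units, so its area works out to
be 9/8 square units, agreeing with our above answer!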
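To see concretely why the absolute values matter, the following Python sketch computes both areas numerically. It uses Simpson’s rule as a stand-in for the exact antiderivative calculation, and the names `simpson`, `gap`, `left_piece` and `right_piece` are made up for the illustration.

```python
import math

def simpson(f, a, b, n=1000):
    """Approximate the integral of f over [a, b] by composite Simpson's rule (n even)."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def gap(x):
    """Height of y = x above y = 1/x (negative when 1/x is on top)."""
    return x - 1 / x

right_piece = simpson(gap, 1, 2)     # y = x is on top here
left_piece = simpson(gap, 0.5, 1)    # y = 1/x is on top here, so this is negative

tempting_but_wrong = left_piece + right_piece   # signed pieces partly cancel
area = abs(left_piece) + abs(right_piece)       # split at x = 1, then absolute values

print(f"area from 1 to 2:   {right_piece:.6f}")
print(f"area from 1/2 to 2: {area:.6f}  (without absolute values: {tempting_but_wrong:.6f})")
```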

17.7 A Technical Point

In Section 17.6.1 above, we saw that

$$\int \frac{1}{x}\,dx = \ln|x| + C.$$

Although everyone writes the formula like this, technically it’s not correct!
You see, we want to find all antiderivatives of 1/x. Sure, ln|x| + C is an
antiderivative for each constant C, but actually there are more. To see why,
let’s start off with the graph of y = ln|x|:

[Figure: graph of y = ln|x|]

This has two pieces, either of which can be shifted up or down without affecting
the derivative. For example, if we shift the left piece up by 1 and the
right piece down by 1/2, the graph looks something like this:

[Figure: graph of y = ln|x| with the left piece shifted up by 1 and the right piece shifted down by 1/2]

This function isn’t of the form ln|x| + C, but its derivative is still 1/x. So
we really need to allow two constants, possibly different, one for each of the
two pieces of the curve:

$$\int \frac{1}{x}\,dx = \begin{cases} \ln|x| + C_1 & \text{if } x < 0, \\ \ln|x| + C_2 & \text{if } x > 0. \end{cases}$$



The reason we can get away without this level of formality, at least most of
the time, is that we only really use one of the constants at a time. Consider
the following three integrals:

$$\int_1^e \frac{1}{x}\,dx, \qquad \int_{-e}^{-1} \frac{1}{x}\,dx \qquad \text{and} \qquad \int_{-1}^{e} \frac{1}{x}\,dx.$$

In the first integral, you are only using the right-hand piece of the curve
y = 1/x. Similarly, in the second integral, only the left-hand piece is relevant.
Try doing both integrals and make sure you get 1 and −1, respectively. As
for the third integral, now we are using both pieces of y = 1/x, but there’s
a problem: the vertical asymptote at x = 0 lies in our interval [−1, e]. We
don’t know how to handle that. In fact, we will learn how to deal with this
sort of thing when we look at improper integrals in Chapter 20. In this case,
it turns out that the third integral above doesn’t even make sense because of
that vertical asymptote. So the only time that definite integrals of the form

$$\int_a^b \frac{1}{x}\,dx$$

make sense is when a and b are both positive or both negative. In either case,
only one of the pieces of ln|x| is involved, and there’s no need to mess around
with two different constants!
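The two-constant point is easy to play with on a computer. In the Python sketch below, `shifted_log` is a made-up name for ln|x| with the left piece shifted up by 1 and the right piece shifted down by 1/2 (the shifts from the example above). Its slope is still 1/x on both sides, and the constants drop out of the first two definite integrals because each one only ever uses a single piece. The slopes are estimated by a central difference, so the check is approximate.

```python
import math

def derivative(f, x, h=1e-6):
    """Estimate f'(x) with a central difference (an approximation, not exact)."""
    return (f(x + h) - f(x - h)) / (2 * h)

def shifted_log(x, c_left=1.0, c_right=-0.5):
    """ln|x| plus a different constant on each side of 0."""
    return math.log(abs(x)) + (c_left if x < 0 else c_right)

# The derivative is still 1/x, even though this isn't ln|x| + C for a single C.
slope_errors = [abs(derivative(shifted_log, x) - 1 / x)
                for x in (-3.0, -0.5, 0.5, 3.0)]

# Each definite integral stays on one side of 0, so only one constant is used
# and it cancels when we take the difference.
first = shifted_log(math.e) - shifted_log(1.0)      # integral of 1/x from 1 to e
second = shifted_log(-1.0) - shifted_log(-math.e)   # integral of 1/x from -e to -1

print(first, second, max(slope_errors))
```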

17.8 Proof of the First Fundamental Theorem

In Section 17.2 above, we gave an intuitive proof of the First Fundamental
Theorem of Calculus. Let’s tighten it up. Recall that

$$F(x) = \int_a^x f(t)\,dt,$$

and we want to find $F'(x)$. We have already seen that

$$F(x+h) - F(x) = \int_x^{x+h} f(t)\,dt.$$

Suppose that h > 0. By the Mean Value Theorem for integrals (see Section 16.6.1
of the previous chapter), there is some number c lying in the
interval [x, x + h] such that

$$\int_x^{x+h} f(t)\,dt = ((x+h) - x)f(c).$$

That is, we have

$$F(x+h) - F(x) = \int_x^{x+h} f(t)\,dt = hf(c)$$

for some c in [x, x + h]. Actually, this is also true if h < 0, except that the
interval is [x + h, x] instead, since x + h < x in that case. Anyway, divide the
above equation by h to get

$$\frac{F(x+h) - F(x)}{h} = f(c).$$

The important thing is that when x is a fixed number (for the moment), the
number c depends on h only, and it lies between x and x + h. Perhaps we
should really rewrite the above equation as

$$\frac{F(x+h) - F(x)}{h} = f(c_h)$$

to emphasize that c depends on h. Now, what happens when $h \to 0$? The
quantity $c_h$ is sandwiched between x and x + h, so as $h \to 0$, the sandwich
principle (see Section 3.6 of Chapter 3) says that $c_h \to x$ as $h \to 0$. On the
other hand, since f is continuous, we must also have $f(c_h) \to f(x)$ as $h \to 0$.
That is,

$$\lim_{h \to 0} \frac{F(x+h) - F(x)}{h} = \lim_{h \to 0} f(c_h) = f(x).$$

This shows that $F'(x) = f(x)$, wrapping up the proof of the First Fundamental
Theorem. As for the Second Fundamental Theorem, we actually already
proved it in Section 17.3 above, so we’re good to go!
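The limit at the heart of this proof can be watched in action. The Python sketch below builds F(x) as a numerical integral of f(t) = cos(t) from a = 0 (my choice of example, not the book’s) and shows the difference quotient closing in on f(x) as h shrinks. The integral is approximated with Simpson’s rule, and `simpson`, `F`, `steps` and `gaps` are illustrative names.

```python
import math

def simpson(f, a, b, n=1000):
    """Approximate the integral of f over [a, b] by composite Simpson's rule (n even)."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def f(t):
    """A continuous integrand, chosen only as an example."""
    return math.cos(t)

def F(x, a=0.0):
    """F(x) = integral of f(t) dt from a to x, computed numerically."""
    return simpson(f, a, x)

x = 1.0
steps = [0.1, 0.01, 0.001]
# Distance between the difference quotient (F(x+h) - F(x))/h and f(x).
gaps = [abs((F(x + h) - F(x)) / h - f(x)) for h in steps]

for h, gap in zip(steps, gaps):
    print(f"h = {h:<6}  |difference quotient - f(x)| = {gap:.5f}")
```

Each time h shrinks by a factor of 10, the gap shrinks too, which is just what the sandwich argument predicts.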






CHAPTER 18 


Techniques of Integration, Part One 


Let’s kick off the process of building up a virtual toolkit of techniques to find 
antiderivatives. In this chapter, we’ll look at the following three techniques: 

• the method of substitution (otherwise known as “change of variables”);

• integration by parts; and 

• using partial fractions to integrate rational functions. 

Then, in the next chapter, we’ll look at some more techniques involving trig 
functions. 


18.1 Substitution 

Using the chain rule, we can easily differentiate $e^{x^2}$ with respect to x and see
that

$$\frac{d}{dx}\left(e^{x^2}\right) = 2xe^{x^2}.$$

The factor 2x is the derivative of $x^2$, which appears in the exponent. Now,
as we saw in Section 17.4 of the previous chapter, we can flip the equation
around to get

$$\int 2xe^{x^2}\,dx = e^{x^2} + C$$

for some constant C. So we can integrate $2xe^{x^2}$ with respect to x. How about
just $e^{x^2}$? You’d think it would be just as easy, if not easier, to find

$$\int e^{x^2}\,dx.$$

It turns out that it’s not just hard to find this; it’s impossible! Well, not quite
impossible, but the fact is, there’s no “nice” expression for an antiderivative
of $e^{x^2}$. (You have to resort to infinite series, definite integrals, or some other
sort of roundabout device.) Perhaps you think that $e^{x^2}/2x$ works? Nope: use
the quotient rule to differentiate this (with respect to x) and you’ll see that
you get something quite different from $e^{x^2}$.

What saves us in the case of $\int 2xe^{x^2}\,dx$ is the presence of the 2x factor,
which is exactly what popped out when we used the chain rule to differentiate
$e^{x^2}$. Now, imagine starting with an indefinite integral like this:

$$\int x^2\cos(x^3)\,dx.$$

We’re taking the cosine of the somewhat nasty quantity $x^3$, but there’s a ray
of hope: the derivative of this quantity is $3x^2$. This almost matches the factor
$x^2$ in the integrand; it’s only the constant 3 that makes things a little more
difficult. Still, constants can move in or out of integrals, so that shouldn’t be
a problem.

Start off by setting $t = x^3$, so that the $\cos(x^3)$ factor becomes cos(t). Our
aim will be to replace everything that has to do with x in the above integral
by stuff in t alone. You might say that the above integral is in x-land and
we’d like to migrate it over to t-land. We’ve already taken care of $\cos(x^3)$,
but we still have $x^2$ and dx to worry about.

In fact, the dx factor is really important. You can’t just change it to dt!
After all, $t = x^3$, so $dt/dx = 3x^2$. If there’s any justice in the world, then we
should be able to rewrite this as $dt = 3x^2\,dx$. Let’s not worry about what this
means; we’ll leave that until Section 18.1.3 below. Instead, suppose we divide
both sides by 3 to get $\frac{1}{3}\,dt = x^2\,dx$. Then we can get rid of the $x^2$ and dx
pieces from our integral at the same time, replacing both by $\frac{1}{3}\,dt$, like this:

$$\int x^2\cos(x^3)\,dx = \int \cos(x^3)\,(x^2\,dx) = \int \cos(t)\left(\tfrac{1}{3}\,dt\right).$$

The middle step isn’t really necessary, but it helps to see $x^2$ and dx next to
each other so that you can justify replacing them by $\frac{1}{3}\,dt$. Anyway, now we
can drag the factor of $\frac{1}{3}$ outside the integral, then integrate; altogether, we
have

$$\int x^2\cos(x^3)\,dx = \int \cos(t)\left(\tfrac{1}{3}\,dt\right) = \frac{1}{3}\int \cos(t)\,dt = \frac{1}{3}\sin(t) + C.$$

It’s pretty lazy to leave the answer as $\frac{1}{3}\sin(t) + C$. We started in x-land, then
migrated over to t-land; now we have to come back to x-land. This isn’t hard
to do: just replace t by $x^3$ once again. We have shown that

$$\int x^2\cos(x^3)\,dx = \frac{1}{3}\sin(x^3) + C.$$

Check that this is true by differentiating $\frac{1}{3}\sin(x^3)$ with respect to x.
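You should do that check by hand with the chain rule, but a machine can double-check it too. The Python sketch below estimates the slope of the answer at a few points and compares it with the integrand. For contrast, it also shows that the tempting guess $e^{x^2}/2x$ mentioned earlier does not differentiate back to $e^{x^2}$. The names `derivative`, `candidate` and `tempting` are invented for the illustration, and the slopes are central-difference estimates.

```python
import math

def derivative(f, x, h=1e-6):
    """Estimate f'(x) with a central difference (an approximation, not exact)."""
    return (f(x + h) - f(x - h)) / (2 * h)

def candidate(x):
    """The answer found by substitution: (1/3) sin(x^3)."""
    return math.sin(x ** 3) / 3

def integrand(x):
    return x * x * math.cos(x ** 3)

errors = [abs(derivative(candidate, x) - integrand(x))
          for x in (-1.5, -0.5, 0.5, 1.5)]
print(f"substitution answer, largest mismatch: {max(errors):.1e}")

def tempting(x):
    """A wrong guess for an antiderivative of e^(x^2)."""
    return math.exp(x * x) / (2 * x)

# At x = 1 the slope of the wrong guess is nowhere near e^(1^2) = e.
mismatch = abs(derivative(tempting, 1.0) - math.exp(1.0))
print(f"wrong guess, mismatch at x = 1: {mismatch:.3f}")
```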

Let’s look at some more examples. First, consider

$$\int e^{2x}\sec^2(e^{2x})\,dx.$$

Since we’re taking $\sec^2$ of the annoying quantity $e^{2x}$, let’s replace that quantity
by t. So substitute $t = e^{2x}$. Differentiate this to see that $dt/dx = 2e^{2x}$. Now
throw the dx onto the right-hand side to see that $dt = 2e^{2x}\,dx$. That’s almost
what we have in the integral; we just need to get rid of the factor of 2. So
divide by 2 to get $\frac{1}{2}\,dt = e^{2x}\,dx$. Moving the above integral to t-land, we get

$$\int e^{2x}\sec^2(e^{2x})\,dx = \int \sec^2(e^{2x})\,(e^{2x}\,dx) = \int \sec^2(t)\left(\tfrac{1}{2}\,dt\right).$$

Now pull out the factor of $\frac{1}{2}$ and integrate to get $\frac{1}{2}\tan(t) + C$. Finally, move
back to x-land by replacing t with $e^{2x}$. We have proved that

$$\int e^{2x}\sec^2(e^{2x})\,dx = \frac{1}{2}\tan(e^{2x}) + C.$$

Again, you should check this by differentiating the right-hand side.
Here’s another example:



$$\int \frac{3x^2+7}{x^3+7x-9}\,dx.$$

This looks pretty difficult. Fortunately, if you differentiate the denominator
$x^3 + 7x - 9$, you get the numerator $3x^2 + 7$. This suggests that we substitute
$t = x^3 + 7x - 9$. Since $dt/dx = 3x^2 + 7$, we can write $dt = (3x^2 + 7)\,dx$. In
t-land, our integral is

$$\int \frac{3x^2+7}{x^3+7x-9}\,dx = \int \frac{1}{x^3+7x-9}\left((3x^2+7)\,dx\right) = \int \frac{1}{t}\,dt = \ln|t| + C.$$

Now switch back to x-land by replacing t with $x^3 + 7x - 9$; this shows that

$$\int \frac{3x^2+7}{x^3+7x-9}\,dx = \ln|x^3+7x-9| + C.$$


Actually, this is a special case of a nice fact: if f is a differentiable function,
then

$$\int \frac{f'(x)}{f(x)}\,dx = \ln|f(x)| + C.$$

So if the top is the derivative of the bottom, then the integral is just the log of
the bottom (with absolute values and the +C). We can prove this in general
by making the substitution t = f(x). Then $dt/dx = f'(x)$, so we can write
$dt = f'(x)\,dx$. See if you can follow each step in this chain of equations which
migrate from x-land to t-land, then back:

$$\int \frac{f'(x)}{f(x)}\,dx = \int \frac{1}{f(x)}\left(f'(x)\,dx\right) = \int \frac{1}{t}\,dt = \ln|t| + C = \ln|f(x)| + C.$$


This fact means that in the above example,

$$\int \frac{3x^2+7}{x^3+7x-9}\,dx,$$

you can just write down the answer $\ln|x^3 + 7x - 9| + C$, since the top is the
derivative of the bottom. Sometimes the top is a multiple of the derivative of
the bottom, like this:

$$\int \frac{x}{x^2+8}\,dx.$$

The derivative of the bottom is 2x, but we only have x on the top. No
problem: multiply and divide by 2, like this:

$$\int \frac{x}{x^2+8}\,dx = \frac{1}{2}\int \frac{2x}{x^2+8}\,dx.$$

Now you can just write down the answer $\frac{1}{2}\ln|x^2 + 8| + C$, since the top (2x)
is the derivative of the bottom ($x^2 + 8$). Finally, consider

$$\int \frac{1}{x\ln(x)}\,dx.$$

The nicest way to do this is to rewrite the integral as

$$\int \frac{1/x}{\ln(x)}\,dx,$$

then notice that the derivative of the bottom (ln(x)) is the top (1/x). By the
formula in the box above, the integral is just ln|ln(x)| + C. That is,

$$\int \frac{1}{x\ln(x)}\,dx = \ln|\ln(x)| + C.$$
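The “log of the bottom” shortcut is worth testing on all three of the examples above. Here’s a Python sketch that differentiates each answer numerically and compares it with the integrand, including at points where the bottom is negative, which is where the absolute values earn their keep. The names `derivative` and `cases` are illustrative, and the slopes are central-difference estimates rather than exact derivatives.

```python
import math

def derivative(f, x, h=1e-6):
    """Estimate f'(x) with a central difference (an approximation, not exact)."""
    return (f(x + h) - f(x - h)) / (2 * h)

# (answer from the shortcut, original integrand, sample points)
cases = [
    (lambda x: math.log(abs(x ** 3 + 7 * x - 9)),
     lambda x: (3 * x * x + 7) / (x ** 3 + 7 * x - 9),
     (0.0, 2.0)),      # the bottom is negative at 0 and positive at 2
    (lambda x: 0.5 * math.log(abs(x * x + 8)),
     lambda x: x / (x * x + 8),
     (-1.0, 3.0)),
    (lambda x: math.log(abs(math.log(x))),
     lambda x: 1 / (x * math.log(x)),
     (0.5, 2.0)),      # ln(x) is negative at 0.5 and positive at 2
]

worst = max(abs(derivative(F, x) - f(x))
            for F, f, points in cases
            for x in points)
print(f"largest mismatch: {worst:.1e}")
```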


18.1.1 Substitution and definite integrals



You can also use the substitution method on definite integrals. There are two
legitimate ways to do this. For example, to find

$$\int_0^{\sqrt[3]{\pi/2}} x^2\cos(x^3)\,dx,$$

you could find the indefinite integral $\int x^2\cos(x^3)\,dx$ first, then plug in the
limits of integration. We actually found this indefinite integral in the previous
section; to recap, we made the substitution $t = x^3$, noting that $dt = 3x^2\,dx$
so $\frac{1}{3}\,dt = x^2\,dx$, then wrote

$$\int x^2\cos(x^3)\,dx = \int \cos(t)\left(\tfrac{1}{3}\,dt\right) = \frac{1}{3}\int \cos(t)\,dt = \frac{1}{3}\sin(t) + C = \frac{1}{3}\sin(x^3) + C.$$

It’s really important to go back to x-land at the last step. Anyway, the
important thing is that we have found an antiderivative for $x^2\cos(x^3)$, and
we can use the Second Fundamental Theorem from Section 17.3 of the previous
chapter to write

$$\int_0^{\sqrt[3]{\pi/2}} x^2\cos(x^3)\,dx = \frac{1}{3}\sin(x^3)\,\bigg|_0^{\sqrt[3]{\pi/2}} = \left(\frac{1}{3}\sin\left(\left(\sqrt[3]{\pi/2}\right)^3\right)\right) - \left(\frac{1}{3}\sin(0^3)\right),$$

which works out to be 1/3. So one way to use the substitution method on a
definite integral is to focus on the indefinite integral first, then after you’ve
found it, plug in the limits of integration.

There’s a snazzier method, though! You can keep the whole thing as a
definite integral the whole way through, provided that you also move the
limits of integration over to t-land as well. In our example, we substituted
$t = x^3$ and used $\frac{1}{3}\,dt = x^2\,dx$ to help move the integral to t-land. Now, when
x = 0, we have $t = 0^3 = 0$, so we can leave the left-hand limit of integration
as 0. On the other hand, when $x = \sqrt[3]{\pi/2}$, we have $t = (\sqrt[3]{\pi/2})^3 = \pi/2$.
This means that we must change the right-hand limit of integration to π/2.
Altogether, here’s the effect of the substitution:

$$\int_0^{\sqrt[3]{\pi/2}} x^2\cos(x^3)\,dx = \frac{1}{3}\int_0^{\pi/2} \cos(t)\,dt.$$

We’ll finish this soon, but first note that it would be a major error to write

$$\frac{1}{3}\int_0^{\sqrt[3]{\pi/2}} \cos(t)\,dt$$

on the right-hand side instead. Since we’re integrating with respect to t, not
x, the limits of integration must refer to relevant values of t. In fact, we can
make things clearer by writing out the limits of integration in terms of the
variable of integration, like this:

$$\int_{x=0}^{x=\sqrt[3]{\pi/2}} x^2\cos(x^3)\,dx = \frac{1}{3}\int_{t=0}^{t=\pi/2} \cos(t)\,dt.$$

This really highlights what’s going on: when x = 0, also t = 0; but when
$x = \sqrt[3]{\pi/2}$, we see that t = π/2. So, all in all, we’ve substituted three things:

1. the dx bit — that became something to do with dt, burning up some of 
the other x stuff in order to make the change; 

2. all the remaining terms in the integrand involving x, so that they became 
terms in t; 

3. the limits of integration. 


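Let’s finish the problem. The best way to set it out is to make a working
column at the left of your page, like this:

$$
\begin{array}{l|l}
t = x^3 & \displaystyle\int_0^{\sqrt[3]{\pi/2}} x^2\cos(x^3)\,dx = \frac{1}{3}\int_0^{\pi/2} \cos(t)\,dt \\[1.5ex]
dt = 3x^2\,dx,\ \text{so } x^2\,dx = \tfrac{1}{3}\,dt & \displaystyle\qquad = \frac{1}{3}\sin(t)\,\bigg|_0^{\pi/2} \\[1.5ex]
\text{when } x = 0,\ t = 0 & \displaystyle\qquad = \left(\tfrac{1}{3}\sin(\pi/2)\right) - \left(\tfrac{1}{3}\sin(0)\right) = \frac{1}{3} \\[1.5ex]
\text{when } x = \sqrt[3]{\pi/2},\ t = \pi/2 &
\end{array}
$$

Note that the entire left-hand column is filled in before we even get to the first
equality of the right-hand column, since we have to use all the information
there to get to t-land.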
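Here is a numerical illustration of why the limits have to travel to t-land along with everything else. The Python sketch below approximates the original x-land integral, the correctly converted t-land integral, and the “major error” version that keeps the old x limit. Simpson’s rule stands in for the exact calculation, and the names `simpson`, `x_land`, `t_land` and `wrong` are made up for the illustration.

```python
import math

def simpson(f, a, b, n=1000):
    """Approximate the integral of f over [a, b] by composite Simpson's rule (n even)."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

upper_x = (math.pi / 2) ** (1 / 3)   # x-land limit: the cube root of pi/2
upper_t = upper_x ** 3               # t-land limit: t = x^3, which is pi/2

x_land = simpson(lambda x: x * x * math.cos(x ** 3), 0, upper_x)
t_land = simpson(math.cos, 0, upper_t) / 3   # limits converted to t values
wrong = simpson(math.cos, 0, upper_x) / 3    # forgot to convert the upper limit

print(f"x-land: {x_land:.6f}   t-land: {t_land:.6f}   unconverted limits: {wrong:.6f}")
```

The first two numbers agree with each other, while the third one doesn’t match either of them.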

Here’s a trickier one:

$$\int_{1/\sqrt{2}}^{\sqrt{3}/2} \frac{1}{\sin^{-1}(x)\sqrt{1-x^2}}\,dx.$$

Ask yourself this: do you see any term somewhere whose derivative is also
present? Hopefully, you do: the derivative of $\sin^{-1}(x)$ is $1/\sqrt{1-x^2}$. So try
the substitution $t = \sin^{-1}(x)$. Yes indeed, $dt/dx = 1/\sqrt{1-x^2}$, so we have

$$dt = \frac{1}{\sqrt{1-x^2}}\,dx.$$

We also have to transfer the limits of integration to t-land by plugging in
$x = 1/\sqrt{2}$ and $x = \sqrt{3}/2$ into the equation $t = \sin^{-1}(x)$, one at a time. You
should get t = π/4 and t = π/3, respectively, provided that you remember
your inverse trig basics! (See Chapter 10 to refresh your memory.) Putting
everything together, we get:

$$\int_{1/\sqrt{2}}^{\sqrt{3}/2} \frac{1}{\sin^{-1}(x)\sqrt{1-x^2}}\,dx = \int_{\pi/4}^{\pi/3} \frac{1}{t}\,dt = \ln|t|\,\bigg|_{\pi/4}^{\pi/3} = \ln\left(\frac{\pi}{3}\right) - \ln\left(\frac{\pi}{4}\right) = \ln\left(\frac{4}{3}\right).$$

To get the final simplified answer, notice that we had to know the log rules
(see Section 9.1.4 of Chapter 9). It’s a really good idea to have these at your
fingertips.

By the way, if you’re particularly eagle-eyed, you might notice that the
above substitution is actually a special case of the rule from the end of the
previous section. This provides an alternative way of finding our integral

$$\int_{1/\sqrt{2}}^{\sqrt{3}/2} \frac{1}{\sin^{-1}(x)\sqrt{1-x^2}}\,dx.$$

Let’s start with the indefinite integral, and rewrite it like this:

$$\int \frac{1}{\sin^{-1}(x)\sqrt{1-x^2}}\,dx = \int \frac{1/\sqrt{1-x^2}}{\sin^{-1}(x)}\,dx.$$

Notice that the top is the derivative of the bottom, so we just have to take
the log of the absolute value of the bottom to see that

$$\int \frac{1}{\sin^{-1}(x)\sqrt{1-x^2}}\,dx = \ln|\sin^{-1}(x)| + C.$$

Now to find the definite integral, you can substitute the original limits of
integration, $\sqrt{3}/2$ and $1/\sqrt{2}$, one at a time into the expression $\ln|\sin^{-1}(x)|$,
then take the difference. I leave the details to you.

Here’s a different sort of problem involving substitution. At the end of
Section 16.1.1 of Chapter 16, we claimed that

$$\text{if } f \text{ is an odd function, then } \int_{-a}^{a} f(x)\,dx = 0 \text{ for any } a.$$

How would you prove that this is true? Start off by splitting up the integral
at x = 0:

$$\int_{-a}^{a} f(x)\,dx = \int_{-a}^{0} f(x)\,dx + \int_{0}^{a} f(x)\,dx.$$

In the first integral on the right-hand side, let’s substitute t = −x. Then
dt = −dx; also, when x = −a, we see that t = a, and when x = 0, t = 0 as
well. So we have

$$\int_{-a}^{0} f(x)\,dx = -\int_{a}^{0} f(-t)\,dt = \int_{0}^{a} f(-t)\,dt.$$

In this last step, we used the minus sign to switch the bounds of integration.
Now, since f is odd, we know that f(−t) = −f(t). This shows that

$$\int_{-a}^{0} f(x)\,dx = \int_{0}^{a} f(-t)\,dt = -\int_{0}^{a} f(t)\,dt.$$

Now if we switch the dummy variable back to x, we see we’ve proved the
following nice result:

$$\int_{-a}^{0} f(x)\,dx = -\int_{0}^{a} f(x)\,dx.$$

This is only true when f is an odd function! Anyway, we can finish by going
back to our first equation and using our nice result:

$$\int_{-a}^{a} f(x)\,dx = \int_{-a}^{0} f(x)\,dx + \int_{0}^{a} f(x)\,dx = -\int_{0}^{a} f(x)\,dx + \int_{0}^{a} f(x)\,dx = 0.$$

We’re all done!
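The proof settles the matter, but it can still be satisfying to watch the cancellation happen with actual numbers. The Python sketch below integrates a few odd functions of my own choosing over [−a, a] and, for contrast, one even function. The integrals are approximated with Simpson’s rule, and `simpson`, `odd_functions`, `results` and `even_result` are illustrative names.

```python
import math

def simpson(f, a, b, n=1000):
    """Approximate the integral of f over [a, b] by composite Simpson's rule (n even)."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

# Three odd functions: each satisfies f(-x) = -f(x).
odd_functions = [
    math.sin,
    lambda x: x ** 3 - 5 * x,
    lambda x: x * math.exp(-x * x),
]

results = [simpson(f, -a, a) for f in odd_functions for a in (0.5, 2.0, 7.0)]
print("odd functions: largest |integral| =", max(abs(r) for r in results))

# An even function such as cos(x) does not cancel: the two halves add up instead.
even_result = simpson(math.cos, -2.0, 2.0)
print("cos(x) over [-2, 2]:", even_result)
```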



18.1.2 How to decide what to substitute 

How do you choose the substitution? Good question. The basic idea is to 
look for some component of the integrand whose derivative is also present as 
a factor of the integrand. In the integral 


$$\int_{1/\sqrt{2}}^{\sqrt{3}/2} \frac{1}{\sin^{-1}(x)\sqrt{1-x^2}}\,dx,$$



the substitution t = sin -1 (a;) works because its derivative 1/Vl — x 2 is right 
there, waiting for us to use it. The same substitution would work on any of 
the integrals 




and 


$$\int \frac{1}{\sqrt{\sin^{-1}(x)\,(1-x^2)}}\,dx.$$


In t-land, these integrals become 



and

$$\int \frac{1}{\sqrt{t}}\,dt.$$

(You could also have seen this by solving for x all the way down to $x = \frac{1}{3}(t^5 - 2)$
and then differentiating with respect to t.) Now let’s look back at the integral.
There are three pieces: $x$, $\sqrt[5]{3x+2}$, and $dx$. The second piece is just t itself,
and we have just worked out the third piece in the above equation. How about
the first piece, x? Well, we know $t^5 = 3x + 2$, so we can rearrange this to get
$x = \frac{1}{3}(t^5 - 2)$. All in all the integral becomes


$$\int x\sqrt[5]{3x+2}\,dx = \int \tfrac{1}{3}(t^5 - 2)\,(t)\left(\tfrac{5}{3}t^4\,dt\right).$$

Now we can multiply and integrate to see that this equals

$$\frac{5}{9}\int \left(t^{10} - 2t^5\right) dt = \frac{5}{99}t^{11} - \frac{5}{27}t^6 + C.$$

Back to x-land: resubstituting $t = (3x+2)^{1/5}$ gives

$$\frac{5}{99}(3x+2)^{11/5} - \frac{5}{27}(3x+2)^{6/5} + C.$$

You should try working this problem on your own, setting your answer out
using a working column on the left, as we’ve been doing previously. Also, you
should check that if you differentiate the answer above, you get the original
integrand $x\sqrt[5]{3x+2}$. By the way, did you notice anything different about this
substitution from all the others we’ve done so far? It’s a subtle point, but
in all the other examples, we had an equation like dt = (x-stuff) dx, whereas
here, we have $dx = \frac{5}{3}t^4\,dt$. This worked out quite nicely, since we just replaced
dx directly. In all the other examples, we had to find a constant multiple of
the x-stuff already present in order to have much of a chance. In Section 19.3
of the next chapter, we’ll see some other examples of integrals where we can
replace dx directly.
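Differentiating an answer with fractional powers like 11/5 and 6/5 is a place where slips are easy to make, so here’s a quick machine check in Python. It rebuilds the answer exactly as it was found, with t equal to the fifth root of 3x + 2, and compares its estimated slope with the integrand. The names `derivative`, `antiderivative` and `integrand` are illustrative, the slope is a central-difference estimate, and the sample points are chosen with 3x + 2 > 0 so that the fractional powers are safe to compute.

```python
def derivative(f, x, h=1e-6):
    """Estimate f'(x) with a central difference (an approximation, not exact)."""
    return (f(x + h) - f(x - h)) / (2 * h)

def antiderivative(x):
    """(5/99)(3x+2)^(11/5) - (5/27)(3x+2)^(6/5), written in terms of t."""
    t = (3 * x + 2) ** (1 / 5)      # t is the fifth root of 3x + 2
    return 5 * t ** 11 / 99 - 5 * t ** 6 / 27

def integrand(x):
    """x times the fifth root of 3x + 2."""
    return x * (3 * x + 2) ** (1 / 5)

errors = [abs(derivative(antiderivative, x) - integrand(x))
          for x in (0.0, 1.0, 5.0)]
print(f"largest mismatch: {max(errors):.1e}")
```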

In general, there are no hard and fast rules about what to substitute. You
just have to go along with your instinct, which will be accurate only if you
have done plenty of practice problems. You can always try any substitution
you like. If the new integral is worse than the original one, or you can’t see
how to migrate everything to t-land, then don’t panic: just go back to the
original integral and try something else.

Now, before we move onto integration by parts, there are two things I want 
to deal with. One is a justification of the substitution method; I’ll do this in 
the next section. The other is to summarize the method of substitution: 

• for indefinite integrals, change everything to do with x and dx to stuff 
involving t and dt, do the new integral, then change back to x stuff; 

• for definite integrals, change everything to do with x and dx to stuff
involving t and dt, and change the limits of integration to the corresponding
t values as well, then do the new integral (no need to go back
to x-land here). Alternatively, treat the integral as an indefinite integral
and when you get the final answer, then substitute in the limits of
integration.

18.1.3 Theoretical justification of the substitution method

Suppose you want to make the substitution $t = x^2$ in some integral. You’d
note that dt/dx = 2x, so you write dt = 2x dx. In some sense, this is a meaningless
statement; after all, what are dt and dx? We know that dt/dx is a
derivative, but dt and dx have only been defined as differentials in Chapter 13.
So what does dt = 2x dx actually mean? A good way to think of it is that a
change in x produces a change in t which is 2x times as large. We actually
looked at this sort of thing all the way back in Section 5.2.7 of Chapter 5.
You can run with this observation and see what it does to a Riemann sum,
but there’s a better way: just use the chain rule.

Here’s how to justify everything. Imagine you have done a substitution
t = g(x), and you work your magic to end up in t-land with $\int f(t)\,dt$, which
works out to be F(t) + C for some constant C. So the t-land part of the
calculation looks like this:

$$\int f(t)\,dt = F(t) + C.$$

Since t = g(x), and we have decreed that $dt = g'(x)\,dx$, the above equation
means the same thing in x-land as

$$\int f(g(x))\,g'(x)\,dx = F(g(x)) + C.$$

All I did was replace both t’s by g(x) and dt by $g'(x)\,dx$. So, if we want to
prove that substitution is a valid method, we need to show that the above
equation is true. Let h(x) = F(g(x)); by the chain rule (see Version 1 in
Section 6.2.5 of Chapter 6), it’s true that $h'(x) = F'(g(x))\,g'(x)$. We can
write this in terms of indefinite integrals like this:

$$\int F'(g(x))\,g'(x)\,dx = h(x) + C.$$

Since h(x) = F(g(x)), we have

$$\int F'(g(x))\,g'(x)\,dx = F(g(x)) + C.$$

Now, since $\int f(t)\,dt = F(t) + C$, we know that $F'(t) = f(t)$. Since t = g(x),
we have $F'(g(x)) = f(g(x))$. The above equation becomes

$$\int f(g(x))\,g'(x)\,dx = F(g(x)) + C,$$

which is exactly the equation we wanted to prove!

By the way, this nice equation allows us to prove the alternative method of
substitution, which was discussed after the last example in the previous section
above. (We’ll also use it over and over when we look at trig substitutions in
Section 19.3 of the next chapter.) In the alternative method, instead of setting
t = g(x), we set x = g(t) for some other function g, and replaced dx by $g'(t)\,dt$.
In that case, our original integral $\int f(x)\,dx$ now supposedly becomes

$$\int f(g(t))\,g'(t)\,dt.$$

We are now supposed to work this out and try to move back to x-land. Well,
by our nice equation, with x replaced by t, we see that the above integral is
equal to F(g(t)) + C, where F is an antiderivative of f. This is just F(x) + C,
which is exactly what we want. So this method works as well, and we have
justified the method of substitution.
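The nice equation has a definite-integral version too: integrating f(g(x))g′(x) from a to b should give the same number as integrating f(t) from g(a) to g(b). The Python sketch below tries this on one example. The choices $f(t) = \sqrt{1+t}$ and g(x) = sin(x) are mine, picked only for illustration, and both integrals are approximated with Simpson’s rule, so this is a sanity check of the formula rather than a proof.

```python
import math

def simpson(f, a, b, n=1000):
    """Approximate the integral of f over [a, b] by composite Simpson's rule (n even)."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def f(t):
    """The t-land integrand (an arbitrary example)."""
    return math.sqrt(1 + t)

g = math.sin          # the substitution t = g(x)
g_prime = math.cos    # its derivative, so that dt = g'(x) dx

a, b = 0.0, 1.0
x_land = simpson(lambda x: f(g(x)) * g_prime(x), a, b)   # integral of f(g(x)) g'(x) dx
t_land = simpson(f, g(a), g(b))                          # integral of f(t) dt, limits moved

print(f"x-land: {x_land:.10f}   t-land: {t_land:.10f}")
```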


18.2 Integration by Parts


We saw how to reverse the chain rule by using the method of substitution.
There is also a way to reverse the product rule; it’s called integration by
parts. Let’s recall the product rule from Section 6.2.3 of Chapter 6: if u and
v depend on x, then

$$\frac{d}{dx}(uv) = v\frac{du}{dx} + u\frac{dv}{dx}.$$

Let’s rearrange this equation and then integrate both sides with respect to x.
We get

$$\int u\frac{dv}{dx}\,dx = \int \frac{d}{dx}(uv)\,dx - \int v\frac{du}{dx}\,dx.$$

The first term on the right-hand side is the antiderivative of the derivative
of uv, so it’s just equal to uv + C. The +C is unnecessary, though, because
the second term on the right-hand side is already an indefinite integral: it
includes a +C automatically. So we have shown that

$$\int u\frac{dv}{dx}\,dx = uv - \int v\frac{du}{dx}\,dx.$$

This is the formula for integration by parts. It’s perfectly usable in this form,
but there’s an abbreviated form which is even more convenient. If we replace
$\frac{dv}{dx}\,dx$ by dv, and replace $\frac{du}{dx}\,dx$ by du, we get the formula

$$\int u\,dv = uv - \int v\,du.$$

Again, this is just an abbreviation for the real formula, but it is pretty useful. 
Let’s see how it works in practice. Suppose we want to find 


$$\int x e^x\,dx.$$

Substitution seems useless (try it and see), so let’s try integration by parts.
We’d love to get the integral in the form $\int u\,dv$ so we can apply the integration
by parts formula. There are a number of ways to do this, but here’s one that
works: set $u = x$ and $dv = e^x\,dx$. Then we certainly have $\int x e^x\,dx = \int u\,dv$.

Now, to apply the integration by parts formula, we need to be able to find
$du$ and $v$ as well. The first one is easy: we know $u = x$, so $du = dx$. How
about the second one? We have $dv = e^x\,dx$, so what is $v$? Just integrate both
sides: $\int dv = \int e^x\,dx$. This means that $v = e^x + C$. Actually, we don’t need
a general $v$ like this; we just need one $v$ that gives $dv = e^x\,dx$. So we can
ignore the $+C$ in this situation and just set $v = e^x$.

We are now ready to apply the formula for integration by parts, with 
$u = x$, $du = dx$, $v = e^x$, and $dv = e^x\,dx$. The easiest way to use the formula is
to write a small version of it with generous spacing, then do the substitutions
underneath, like this:

$$\begin{aligned}
\int u\,dv &= u\,v - \int v\,du \\
\int x\,e^x\,dx &= x\,e^x - \int e^x\,dx.
\end{aligned}$$

Now we still have one integral left, but it’s just $\int e^x\,dx$, which is $e^x + C$.
Plugging this in, we see that $\int x e^x\,dx = x e^x - e^x + C$. (Technically it should
be $-C$, not $+C$, but minus a constant is just another constant and there’s no
need to distinguish.)
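
If you’d like a quick sanity check of this answer, here is a minimal Python sketch. It differentiates $x e^x - e^x$ numerically and compares the result with the integrand $x e^x$ at a few sample points. The helper name `derivative`, the step size `h`, and the sample points are illustrative choices only.

```python
import math

def derivative(f, x, h=1e-5):
    """Central-difference approximation to f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)

def antiderivative(x):
    # Candidate answer from integrating by parts with u = x, dv = e^x dx.
    return x * math.exp(x) - math.exp(x)

def integrand(x):
    return x * math.exp(x)

# Largest gap between the derivative of the answer and the integrand.
max_error = max(abs(derivative(antiderivative, x) - integrand(x))
                for x in (-2.0, -0.5, 0.0, 1.0, 2.5))
print(max_error < 1e-6)
```

A numerical check like this can’t prove the formula, but it catches sign slips and lost factors very quickly.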

In order to set out the calculation for du and v, I recommend writing the 
following: 

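$$\begin{array}{ll}
u = x & v = \underline{\qquad} \\
du = \underline{\qquad} & dv = e^x\,dx,
\end{array}$$

and then filling in the blanks by differentiating $u$ and integrating $dv$:

$$\begin{array}{ll}
u = x & v = e^x \\
du = dx & dv = e^x\,dx.
\end{array}$$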


Then you can easily substitute into the integration by parts formula, since 
you have everything you need at your fingertips. 

Now, how on earth did we know to choose $u = x$ and $dv = e^x\,dx$? Why
couldn’t we have chosen $u = e^x$ and $dv = x\,dx$? Well, we could have. In that
case, we would have

$$\begin{array}{ll}
u = e^x & v = \tfrac{1}{2}x^2 \\
du = e^x\,dx & dv = x\,dx;
\end{array}$$

note that we integrated $dv = x\,dx$ to get $v = \frac{1}{2}x^2$ (remember, we don’t need
$+C$ here). Then by the integration by parts formula, we have

$$\begin{aligned}
\int u\,dv &= u\,v - \int v\,du \\
\int x e^x\,dx = \int e^x \cdot x\,dx &= e^x \cdot \tfrac{1}{2}x^2 - \int \tfrac{1}{2}x^2 \cdot e^x\,dx.
\end{aligned}$$

There’s nothing wrong with this, but it’s not very useful. You see, the last
integral on the right-hand side is nastier than the original integral! So we’d
better stick with the first way above. In general, if you see $e^x$ in there, treat
it well: it is your friend, since its integral is also $e^x$. The moral is that if $e^x$
is present, you should normally let $dv = e^x\,dx$ so that $v$ is simply equal to $e^x$.


18.2.1 Some variations

A few complications can arise. Sometimes you need to integrate by parts 
twice or more. For example, how would you find 


$$\int x^2 \sin(x)\,dx?$$

Well, it’s a product, and substitution doesn’t seem to work, so let’s try
integration by parts. There’s no $e^x$, but there is a $\sin(x)$ which is almost as good.
Let’s try $u = x^2$ and $dv = \sin(x)\,dx$. We get

$$\begin{array}{ll}
u = x^2 & v = -\cos(x) \\
du = 2x\,dx & dv = \sin(x)\,dx;
\end{array}$$

here we integrated $dv = \sin(x)\,dx$ to get $v = \int \sin(x)\,dx = -\cos(x)$ (remember,
no $+C$ is needed). So we have

$$\begin{aligned}
\int u\,dv &= u\,v - \int v\,du \\
\int x^2\sin(x)\,dx &= x^2(-\cos(x)) - \int (-\cos(x))\,2x\,dx \\
&= -x^2\cos(x) + \int \cos(x)\cdot 2x\,dx.
\end{aligned}$$

Now we can pull out the 2 from the last integral and we would be finished
if only we knew what the integral $\int x\cos(x)\,dx$ was. This is a little simpler
than the first integral, since we now have $x$ instead of $x^2$, and after all, the
cosine and sine functions are pretty darn similar. So we integrate by parts
again. Let’s try $U = x$ and $dV = \cos(x)\,dx$; I’m using capital letters since I
already used $u$ and $v$ in this problem. We now have

$$\begin{array}{ll}
U = x & V = \sin(x) \\
dU = dx & dV = \cos(x)\,dx,
\end{array}$$

so substituting in, we get

$$\begin{aligned}
\int U\,dV &= U\,V - \int V\,dU \\
\int x\cos(x)\,dx &= x\sin(x) - \int \sin(x)\,dx.
\end{aligned}$$

How about that: we know that $\int \sin(x)\,dx = -\cos(x) + C$, so we get
$\int x\cos(x)\,dx = x\sin(x) + \cos(x) + C$.

We’re almost done. We just have to plug this back in above and get

$$\int x^2\sin(x)\,dx = -x^2\cos(x) + 2x\sin(x) + 2\cos(x) + C.$$

(Once again, I didn’t write $+2C$ because it’s just a constant.)
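
Two rounds of integration by parts leave plenty of room for sign errors, so here is a small, hedged check in Python. It compares a Simpson’s-rule estimate of $\int_0^2 x^2\sin(x)\,dx$ with the value predicted by the antiderivative above. The interval $[0, 2]$, the helper `simpson`, and the number of subintervals are arbitrary illustrative choices.

```python
import math

def simpson(f, a, b, n=1000):
    """Composite Simpson's rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def F(x):
    # Antiderivative obtained after integrating by parts twice.
    return -x**2 * math.cos(x) + 2 * x * math.sin(x) + 2 * math.cos(x)

numeric = simpson(lambda x: x**2 * math.sin(x), 0.0, 2.0)
exact = F(2.0) - F(0.0)
print(abs(numeric - exact) < 1e-9)
```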

Sometimes you can integrate by parts twice but things don’t seem to get 
simpler. In this case, if you’re lucky, then you might just get a multiple of the 
original integral back at the end. Then unless you are actually unlucky, you 
can throw it over to the other side and solve, which is a neat trick. (If you 
are unlucky, then the integrals cancel out, which doesn’t help at all!) To see 
what on earth I’m talking about, here’s an example: 

$$\int \cos(x)e^{2x}\,dx.$$

In the integrand, the cosine bit and the exponential bit are both nice, but the
exponential bit is nicer, so I’ll set $u = \cos(x)$ and $dv = e^{2x}\,dx$. We get

$$\begin{array}{ll}
u = \cos(x) & v = \tfrac12 e^{2x} \\
du = -\sin(x)\,dx & dv = e^{2x}\,dx.
\end{array}$$

(Don’t forget to divide by 2 when you integrate $e^{2x}$ to get $v$.) This gives

$$\begin{aligned}
\int \cos(x)e^{2x}\,dx &= \cos(x)\cdot\tfrac12 e^{2x} - \int \tfrac12 e^{2x}(-\sin(x))\,dx \\
&= \tfrac12\cos(x)e^{2x} + \tfrac12\int \sin(x)e^{2x}\,dx.
\end{aligned}$$


Now the new integral on the right is about the same level of difficulty as 
the first one, so it’s not clear we’ve gained anything at all. Nevertheless we 
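persevere and integrate by parts again, this time setting $U = \sin(x)$ and
$dV = e^{2x}\,dx$. Let’s see what we get:

$$\begin{array}{ll}
U = \sin(x) & V = \tfrac12 e^{2x} \\
dU = \cos(x)\,dx & dV = e^{2x}\,dx.
\end{array}$$

Integrating by parts, we find that

$$\begin{aligned}
\int U\,dV &= U\,V - \int V\,dU \\
\int \sin(x)e^{2x}\,dx &= \sin(x)\cdot\tfrac12 e^{2x} - \int \tfrac12 e^{2x}\cos(x)\,dx \\
&= \tfrac12\sin(x)e^{2x} - \tfrac12\int \cos(x)e^{2x}\,dx.
\end{aligned}$$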

All in all, then, we have 

$$\begin{aligned}
\int \cos(x)e^{2x}\,dx &= \tfrac12\cos(x)e^{2x} + \tfrac12\left(\tfrac12\sin(x)e^{2x} - \tfrac12\int\cos(x)e^{2x}\,dx\right) \\
&= \tfrac12\cos(x)e^{2x} + \tfrac14\sin(x)e^{2x} - \tfrac14\int\cos(x)e^{2x}\,dx.
\end{aligned}$$

Does this help? Well, yes, if we notice that the same integral appears on
both sides, and then put both integrals on the left-hand side. In fact, we can
add $\frac14$ of the integral to both sides to eliminate it from the right-hand side,
and put in a $+C$ to get

$$\frac54\int\cos(x)e^{2x}\,dx = \frac12\cos(x)e^{2x} + \frac14\sin(x)e^{2x} + C.$$

Now we just multiply by 4/5 to see that

$$\int\cos(x)e^{2x}\,dx = \frac25\cos(x)e^{2x} + \frac15\sin(x)e^{2x} + C.$$

(Once again, we don’t write $+\frac45 C$; we just relabel the constant and write $+C$.)
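
The “throw it over to the other side” step is just a little algebra, and it can be sketched in Python. The snippet below starts from the relation $I = \frac12\cos(x)e^{2x} + \frac14\sin(x)e^{2x} - \frac14 I$, solves for the coefficients, and then checks the resulting antiderivative by numerical differentiation. The variable names and the sample points are illustrative assumptions, not anything fixed by the method.

```python
import math
from fractions import Fraction

# After two rounds of parts: I = (1/2) cos(x) e^(2x) + (1/4) sin(x) e^(2x) - (1/4) I.
cos_coeff, sin_coeff, back_coeff = Fraction(1, 2), Fraction(1, 4), Fraction(-1, 4)

# Move the multiple of I to the left: (1 - back_coeff) I = ..., then divide through.
scale = 1 / (1 - back_coeff)
cos_final, sin_final = cos_coeff * scale, sin_coeff * scale
print(cos_final, sin_final)

def F(x):
    return (float(cos_final) * math.cos(x) + float(sin_final) * math.sin(x)) * math.exp(2 * x)

def derivative(f, x, h=1e-5):
    """Central-difference approximation to f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)

max_error = max(abs(derivative(F, x) - math.cos(x) * math.exp(2 * x))
                for x in (-1.0, 0.0, 0.7, 1.5))
print(max_error < 1e-6)
```

Notice that if `back_coeff` were equal to 1, the division would fail; that is the unlucky case where the integrals cancel and the trick gives no information.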
There’s one other type of integral that needs integration by parts but is in
disguise. In particular, the integrand doesn’t appear to be a product. Some
integrals that fall into this category are

$$\int \ln(x)\,dx, \quad \int (\ln(x))^2\,dx, \quad \int \sin^{-1}(x)\,dx, \quad\text{and}\quad \int \tan^{-1}(x)\,dx.$$



That is, the integrand is any inverse trig function (by itself) or a power of 
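$\ln(x)$. In this case, you should let $u$ be the integrand itself, and let $dv = dx$.
For example, to find

$$\int_0^1 \tan^{-1}(x)\,dx,$$

let $u = \tan^{-1}(x)$ and $dv = dx$. We then have

$$\begin{array}{ll}
u = \tan^{-1}(x) & v = x \\
du = \dfrac{1}{1+x^2}\,dx & dv = dx,
\end{array}$$

and so (ignoring the limits of integration for the moment)

$$\begin{aligned}
\int \tan^{-1}(x)\,dx &= \tan^{-1}(x)\,x - \int x\cdot\frac{1}{1+x^2}\,dx \\
&= x\tan^{-1}(x) - \int \frac{x}{1+x^2}\,dx.
\end{aligned}$$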


Using the method from the end of Section 18.1 above, the right-hand integral 
works out to be equal to $\frac{1}{2}\ln(1 + x^2) + C$ (make sure you agree with this!),
so we have

$$\int_0^1 \tan^{-1}(x)\,dx = \Big[x\tan^{-1}(x) - \tfrac{1}{2}\ln(1+x^2)\Big]_0^1 = \frac{\pi}{4} - \frac{1}{2}\ln(2).$$



How do you get the last answer? Know thy logs and inverse trig functions! 
Make sure that you believe that the above answer is correct. Also, notice 
that we found the indefinite integral first in order to find the definite integral 
(as opposed to trying to migrate the limits of integration to $u$-and-$v$-land!).
This is a good idea in general. That is, when solving a definite integral by 
integrating by parts, find the indefinite integral first, then substitute the limits 
of integration at the end. 
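
Here is a short Python sketch of that advice: find the indefinite integral first, then substitute the limits at the end. It uses the antiderivative $x\tan^{-1}(x) - \frac12\ln(1+x^2)$ and, purely as an illustrative choice of limits, evaluates from 0 to 1, then compares with a Simpson’s-rule estimate. The helper `simpson` and its parameters are assumptions made for the sketch.

```python
import math

def simpson(f, a, b, n=1000):
    """Composite Simpson's rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def F(x):
    # Antiderivative found by parts with u = arctan(x), dv = dx.
    return x * math.atan(x) - 0.5 * math.log(1 + x * x)

# Substitute the limits only at the very end.
from_formula = F(1.0) - F(0.0)
closed_form = math.pi / 4 - 0.5 * math.log(2)
numeric = simpson(math.atan, 0.0, 1.0)
print(abs(from_formula - closed_form) < 1e-12, abs(numeric - from_formula) < 1e-9)
```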


18.3 Partial Fractions


Let’s focus our attention on how to integrate a rational function. So we want 
to find an integral like

$$\int \frac{p(x)}{q(x)}\,dx,$$

where p and q are polynomials. This covers a whole slew of integrals, for 
example, 


+ 9 


dx, 


X s + 


■ dx, 


x s — 2x 2 + 3x — 7 


dx. 
These seem a little complicated. Here are some simpler ones:

$$\int \frac{1}{x-3}\,dx, \quad \int \frac{1}{(x+5)^2}\,dx, \quad \int \frac{1}{x^2+9}\,dx, \quad\text{and}\quad \int \frac{x}{x^2+9}\,dx.$$

The last four integrands are all rational functions, but they are a lot simpler.
Try to work out all of these integrals using substitution. (Hint: some
substitutions which work are $t = x - 3$, $t = x + 5$, $t = x/3$, and $t = x^2 + 9$ for the
four integrals, respectively.) The first two of these integrals have denominators
which are powers of linear functions, whereas the last two have quadratic
denominators which cannot be factored.

So, here’s the idea: first we’ll see how to take a general rational function 
and do some algebra to bust it up into a sum of simpler rational functions; 
then we’ll see how to integrate the simpler types of rational functions. The 
simpler functions I’m talking about are all like the four above: they either 
look like a constant over a linear power, or they look like a linear function 
over a quadratic. We’ll look at the algebra first, then the calculus. Finally, 
we’ll give a summary and look at a big example. 


18.3.1 The algebra of partial fractions

Our goal is to break up a rational function into simpler pieces. The first step 
in this process is to make sure that the numerator of the function has degree 
less than the denominator. If not, we’ll have to start off with a long division. 
So in the examples

$$\int \frac{x+2}{x^2-1}\,dx \quad\text{and}\quad \int \frac{5x^2+x-3}{x^2-1}\,dx,$$

the first is fine, since the degree of the top (1) is less than the degree of the
bottom (2). The second example isn’t so great, because the degrees of the top
and bottom are equal (to 2). We’d have the same trouble if there were a cubic
or higher-degree polynomial on the top. So, we have to do a long division. To
do this, write


    denominator ) numerator

In our example of

$$\int \frac{5x^2+x-3}{x^2-1}\,dx,$$

here’s what the long division looks like:

              5
    x^2 - 1 ) 5x^2 + x - 3
              5x^2     - 5
              ------------
                     x + 2

The division shows that we get a quotient of 5 and a remainder of $x + 2$. So
we have

$$\frac{5x^2+x-3}{x^2-1} = 5 + \frac{x+2}{x^2-1}.$$
If we integrate both sides with respect to $x$, we get

$$\int \frac{5x^2+x-3}{x^2-1}\,dx = \int \left(5 + \frac{x+2}{x^2-1}\right)dx.$$

Now we can break up the integral into two pieces, and actually do the integral
in the first piece, to see that our original integral is equal to

$$\int 5\,dx + \int \frac{x+2}{x^2-1}\,dx = 5x + \int \frac{x+2}{x^2-1}\,dx.$$

The new integral has a degree of 1 on the top and 2 on the bottom, which is 
the way we like it. We’re now ready to proceed. 
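
The long division step is mechanical enough to sketch in code. The following Python function is a minimal illustration, assuming polynomials are stored as coefficient lists with the highest power first; the name `poly_divmod` and that representation are my own choices for the sketch.

```python
from fractions import Fraction

def poly_divmod(num, den):
    """Divide polynomials given as coefficient lists, highest power first.

    Returns (quotient, remainder) as lists of Fractions.
    """
    num = [Fraction(c) for c in num]
    den = [Fraction(c) for c in den]
    quotient = []
    # Peel off one leading term per pass, just as in long division by hand.
    while len(num) >= len(den):
        factor = num[0] / den[0]
        quotient.append(factor)
        for i, c in enumerate(den):
            num[i] -= factor * c
        num.pop(0)  # the leading coefficient is now zero
    return quotient, num

# (5x^2 + x - 3) divided by (x^2 - 1): quotient 5, remainder x + 2.
q, r = poly_divmod([5, 1, -3], [1, 0, -1])
print(q, r)
```

If the numerator already has lower degree than the denominator, the loop never runs, the quotient list is empty, and the remainder is the numerator itself, which matches the case where no division is needed.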

Next, we’ll factor the denominator. If the denominator is a quadratic, 
check the discriminant: as we saw in Section 1.6 of Chapter 1, if this is 
negative, you can’t factor the quadratic. Otherwise, you can factor it by hand 
or by using the quadratic formula. If your denominator is more complicated, 
you may have to guess a root and do a long division. 

After factoring the denominator, the next step is to write down something 
called the “form.” This is made by adding together one or more terms for 
each factor of the denominator, according to the following rules: 

1. If you have a linear factor $(x + a)$, then the form has a term like

$$\frac{A}{x+a}.$$

2. If you have the square of a linear factor $(x+a)^2$, then the form has terms
like

$$\frac{A}{(x+a)^2} + \frac{B}{x+a}.$$

3. If you have a quadratic factor $(x^2 + ax + b)$, then the form has a term
like

$$\frac{Ax+B}{x^2+ax+b}.$$

Those are the most common ones. Here are some rarer beasts:

4. If you have the cube of a linear factor $(x+a)^3$, then the form has terms
like

$$\frac{A}{(x+a)^3} + \frac{B}{(x+a)^2} + \frac{C}{x+a}.$$

5. If you have the fourth power of a linear factor $(x+a)^4$, then the form
has terms like

$$\frac{A}{(x+a)^4} + \frac{B}{(x+a)^3} + \frac{C}{(x+a)^2} + \frac{D}{x+a}.$$

Notice that the form only depends on the denominator. The numerator
is irrelevant! Also, when I use constants like $A$, $B$, $C$, and $D$ above,
bear in mind that you can’t reuse constants in different terms. So you need
to keep advancing along the alphabet. In our example

$$\int \frac{x+2}{x^2-1}\,dx$$

from above, the denominator factors as $(x-1)(x+1)$; so we have two linear
factors, and the form is

$$\frac{A}{x-1} + \frac{B}{x+1}.$$

We can’t use $A$ twice, so we used $B$ for the second term. By the way, you’re
playing with fire if you write the constants as $C_1$, $C_2$, $C_3$ instead of $A$, $B$, $C$,
and so on. You’re less likely to make a careless mistake if you can actually 
tell the difference between the constants without having to look at tiny little 
numbers in subscripts. 

Here’s another example of finding the form. What would the form of

$$\frac{\text{any old junk}}{(x-1)(x+4)^3(x^2+4x+7)(3x^2-x+1)}$$

be? The answer is

$$\frac{A}{x-1} + \frac{B}{(x+4)^3} + \frac{C}{(x+4)^2} + \frac{D}{x+4} + \frac{Ex+F}{x^2+4x+7} + \frac{Gx+H}{3x^2-x+1}.$$

You may write these terms in a different order, or switch the constants A 
through H around; that’s OK. 

Once you’ve found the form, you should write down that the integrand 
equals the form, then multiply through by the denominator. For example, we 
just found that the form for the integrand of

$$\int \frac{x+2}{x^2-1}\,dx$$

is given by

$$\frac{A}{x-1} + \frac{B}{x+1};$$

so we write

$$\frac{x+2}{x^2-1} = \frac{A}{x-1} + \frac{B}{x+1}.$$

Actually, you’re better off writing the denominator on the left-hand side in
the factored manner, like this:

$$\frac{x+2}{(x-1)(x+1)} = \frac{A}{x-1} + \frac{B}{x+1}.$$

Now multiply through by the denominator $(x-1)(x+1)$ to get

$$x + 2 = A(x+1) + B(x-1).$$


Notice that the factor $(x-1)$ cancels in the first term on the right-hand side,
and the factor $(x+1)$ cancels out in the second term. Anyway, now there are
two different ways we can proceed. The first way is to substitute clever values
of $x$. If you put $x = 1$, then the $B(x-1)$ term goes away, and you get

$$1 + 2 = A(1 + 1).$$

That is, $A = \frac32$. Now if instead you put $x = -1$ in the original equation, the
$A(x+1)$ term goes away:

$$-1 + 2 = B(-1 - 1).$$

So $B = -\frac12$. Alternatively, another way of finding $A$ and $B$ is to take our
original equation $x + 2 = A(x+1) + B(x-1)$ and rewrite it as

$$x + 2 = (A + B)x + (A - B).$$

Now we can equate coefficients of $x$ to see that $1 = A + B$. We can also
equate the constant coefficients to get $2 = A - B$. It’s easy to solve these
simultaneously and find that $A = \frac32$ and $B = -\frac12$ as before.

You might have noticed that in both of the ways we found $A$ and $B$, we
needed two facts. For the substitution method, we put $x = 1$ and then $x = -1$,
whereas for the method of equating coefficients, we equated the coefficients of
$x$ and also the constant coefficients. We actually could have used one instance
of each method. For example, if you put $x = 1$, you find that $A = \frac32$ as above;
then if you equate coefficients of $x$, you find that $1 = A + B$, so $B = -\frac12$.

In general, however many constants you have to find, that’s how many times 
you have to apply one or both of the methods, mixing and matching as you 
choose. 
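
To see the “clever values” method in action, here is a small Python sketch using exact fractions. It substitutes $x = 1$ and $x = -1$ into $x + 2 = A(x+1) + B(x-1)$ to get $A$ and $B$, then confirms that the original fraction and the decomposed version agree at a few other values of $x$. The function names and test points are illustrative choices.

```python
from fractions import Fraction

def numerator(x):
    return x + 2

# Put x = 1 to kill the B(x - 1) term, and x = -1 to kill the A(x + 1) term.
A = Fraction(numerator(1), 1 + 1)
B = Fraction(numerator(-1), -1 - 1)

def original(x):
    return Fraction(x + 2, x * x - 1)

def decomposed(x):
    return A / (x - 1) + B / (x + 1)

print(A, B)
print(all(original(x) == decomposed(x) for x in (2, 3, -4, 10)))
```

Checking the identity at a few extra points is a cheap way to catch an arithmetic slip before you start integrating.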

All that’s left is to rewrite your integrand as equal to the form again, but 
this time with the constants filled in. So in our example, 

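$$\frac{x+2}{x^2-1} = \frac{A}{x-1} + \frac{B}{x+1} = \frac{3/2}{x-1} + \frac{-1/2}{x+1}.$$

Now integrate both sides, pulling out the constant factors as you split up the
integral:

$$\int \frac{x+2}{x^2-1}\,dx = \frac32\int\frac{1}{x-1}\,dx - \frac12\int\frac{1}{x+1}\,dx.$$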

We have successfully busted up our original integral into two integrals which 
are much simpler. We’ll solve these integrals very soon. 

So far, we’ve seen that we do a long division unless the degree of the top 
is less than the degree of the bottom; then we factor the denominator; then 
we write down the form; then we use one of two methods to find the unknown 
constants. Finally, we write down the integrals of the various pieces. We’ll 
see another example of how to do all this in Section 18.3.3 below. In the 
meantime, let’s do some integration. 


18.3.2 Integrating the pieces

We need to see how to integrate the various pieces which remain after you 
break up the original integral. The simplest type of integral is of the form 


$$\int \frac{1}{ax+b}\,dx.$$

To do this, just substitute $t = ax + b$. For example, at the end of the previous
section, we saw that

$$\int \frac{x+2}{x^2-1}\,dx = \frac32\int\frac{1}{x-1}\,dx - \frac12\int\frac{1}{x+1}\,dx.$$

You can let $t = x - 1$ to do the first integral, and $t = x + 1$ to do the second.
In both cases $dt = dx$, so it’s easy to see that

$$\int \frac{x+2}{x^2-1}\,dx = \frac32\ln|x-1| - \frac12\ln|x+1| + C.$$

Here’s another example: to find

$$\int \frac{1}{4x+5}\,dx,$$

put $t = 4x + 5$ so that $dt = 4\,dx$; then when the integral migrates to $t$-land,
it simply becomes $\frac14\int 1/t\,dt$, which is $\frac14\ln|t| + C$. Finally, substitute back for
$t$ to see that the above integral works out to be $\frac14\ln|4x+5| + C$.

The same trick works for a power of a linear factor in the denominator; 
for example, to find 

$$\int \frac{1}{(4x+5)^2}\,dx,$$

substitute $t = 4x + 5$ once again. The integral becomes $\frac14\int 1/t^2\,dt$, which is
$-\frac14(1/t) + C$; going back to $x$-land, we have shown that

$$\int \frac{1}{(4x+5)^2}\,dx = -\frac14 \times \frac{1}{4x+5} + C = -\frac{1}{4(4x+5)} + C.$$


The difficult case involves a quadratic in the bottom, like this: 
$$\int \frac{Ax+B}{ax^2+bx+c}\,dx.$$

Beware! If the quadratic can be factored, then you need to do this first. This
was the case in our previous example,

$$\int \frac{x+2}{x^2-1}\,dx.$$

We factored the denominator as $(x-1)(x+1)$; this eventually led to two
integrals whose integrands had linear denominators. So there was no need
to integrate anything with a quadratic on the bottom. Even the previous
example, with $(4x+5)^2$ on the bottom, posed no problem, since we just had
to deal with the square of a linear term. 

So, what’s left? The only possibility is that the quadratic on the bottom 
cannot be factored. That is, its discriminant $b^2 - 4ac$ is negative. An example
of such an integral is

$$\int \frac{x+8}{x^2+6x+13}\,dx.$$

The denominator is a quadratic with discriminant $6^2 - 4(13)$, which is negative.
We actually don’t have to do any of the algebra from the previous section in
this case, since the denominator can’t be factored. There’s no need to use any
form at all; we just have to do the integral. Here’s how: complete the square
on the bottom, then make a substitution. (See Section 1.6 in Chapter 1 for a
review of completing the square.) Let’s complete the square in our example:

$$x^2 + 6x + 13 = x^2 + 6x + 9 + 13 - 9 = (x+3)^2 + 4.$$
So we have

$$\int \frac{x+8}{x^2+6x+13}\,dx = \int \frac{x+8}{(x+3)^2+4}\,dx.$$

Now substitute $t = x + 3$, so that $x = t - 3$ and $dx = dt$:

$$\int \frac{x+8}{x^2+6x+13}\,dx = \int \frac{x+8}{(x+3)^2+4}\,dx = \int \frac{(t-3)+8}{t^2+4}\,dt = \int \frac{t+5}{t^2+4}\,dt.$$


The next step is to break this last integral into two integrals and pull out the
factor of 5, so the above integral becomes

$$\int \frac{t}{t^2+4}\,dt + 5\int \frac{1}{t^2+4}\,dt.$$

The first integral is just like the ones at the end of Section 18.1 above. You
put a factor of 2 on the top and bottom, then recognize that the derivative of
the bottom is just the top, so you get the log of the bottom:

$$\int \frac{t}{t^2+4}\,dt = \frac12\int \frac{2t}{t^2+4}\,dt = \frac12\ln|t^2+4| + C.$$

Actually, since $t^2 + 4$ is always positive, we can drop the absolute values.
Anyway, to do the second integral, which is

$$5\int \frac{1}{t^2+4}\,dt,$$

just remember the useful formula

$$\int \frac{1}{t^2+a^2}\,dt = \frac1a\tan^{-1}\left(\frac ta\right) + C.$$

(You should try to prove this by differentiating the right-hand side, or by
substituting $t = au$ in the left-hand side.) Anyway, with $a = 2$, this formula
becomes

$$5\int \frac{1}{t^2+4}\,dt = 5 \times \frac12\tan^{-1}\left(\frac t2\right) + C.$$

So, we have evaluated our integral as

$$\frac12\ln(t^2+4) + \frac52\tan^{-1}\left(\frac t2\right) + C.$$

Now just replace $t$ by $x + 3$ once again to get

$$\frac12\ln((x+3)^2+4) + \frac52\tan^{-1}\left(\frac{x+3}{2}\right) + C.$$

The expression $(x+3)^2 + 4$ immediately simplifies to $x^2 + 6x + 13$, our original
denominator. There’s actually no need to expand it; just look back to where
we completed the square and you’ll find the equation you need. So, we have
finally shown that

$$\int \frac{x+8}{x^2+6x+13}\,dx = \frac12\ln(x^2+6x+13) + \frac52\tan^{-1}\left(\frac{x+3}{2}\right) + C.$$
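
With a log and an inverse tangent both in play, it’s worth double-checking the answer. Here is a minimal Python sketch that differentiates $\frac12\ln(x^2+6x+13) + \frac52\tan^{-1}\left(\frac{x+3}{2}\right)$ numerically and compares it with the integrand $\frac{x+8}{x^2+6x+13}$. The helper `derivative`, its step size, and the sample points are illustrative assumptions.

```python
import math

def derivative(f, x, h=1e-5):
    """Central-difference approximation to f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)

def F(x):
    # Log piece plus inverse-tangent piece from completing the square.
    return 0.5 * math.log(x * x + 6 * x + 13) + 2.5 * math.atan((x + 3) / 2)

def integrand(x):
    return (x + 8) / (x * x + 6 * x + 13)

max_error = max(abs(derivative(F, x) - integrand(x))
                for x in (-5.0, -3.0, 0.0, 2.0, 10.0))
print(max_error < 1e-6)
```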
If the original quadratic on the bottom isn’t monic, I suggest that you pull
out the leading coefficient before completing the square. So, to find

$$\int \frac{x+8}{2x^2+12x+26}\,dx,$$

pull out a factor of 2 in the bottom to write the integral as

$$\frac12\int \frac{x+8}{x^2+6x+13}\,dx.$$

This is the same integral as before, except for the factor of $\frac12$ out front, so it
simplifies to

$$\frac14\ln(x^2+6x+13) + \frac54\tan^{-1}\left(\frac{x+3}{2}\right) + C.$$

Now, let’s summarize the whole partial fraction method, then see a big example
of the whole darn thing.


18.3.3 The method and a big example 

Here’s the complete method for finding the integral of a rational function: 


Step 1 —— check degrees, divide if necessary: check to see if the degree of 
the numerator is less than the degree of the denominator. If it is, then you’re 
golden — go on to step 2. If not, do a long division, then proceed to step 2. 


Step 2 —— factor the denominator: use the quadratic formula, or guess 
roots and divide, to factor the denominator of your integrand. 

Step 3 —— the form: write down the “form,” with undetermined constants,
as described in Section 18.3.1 above. Write down an equation like

integrand = form.

Step 4 —— evaluate constants: multiply both sides of this equation by the
denominator, then find the constants by (a) substituting clever values of $x$;
(b) equating coefficients; or some combination of (a) and (b). Now you can
express your integral as the sum of rational functions which either have
constants on the top and powers of linear functions on the bottom, or look like
a linear function divided by a quadratic function.


Step 5 —— integrate terms with linear powers on the bottom: solve any 
integrals whose denominators are powers of linear functions; the answers will 
involve logs or negative powers of the linear term. 

Step 6 —— integrate terms with quadratics on the bottom: for each 
integral with a nonfactorable quadratic term in the denominator, complete the 
square, make a change of variables, then possibly split up into two integrals. 
The first one will involve logs and the second should involve $\tan^{-1}$. If there’s

The quadratic $x^2 - 5x + 9$ has discriminant $(-5)^2 - 4(9) = -11$; because this
is negative, the quadratic can’t be factored. So we’re done with step 2.

Step 3 —— the form: we have two factors, $x^2$ and $x^2 - 5x + 9$. Don’t think of
the first factor $x^2$ as a quadratic; instead, think of it as the square of a linear
factor. It might be better to write $x^2$ as $(x - 0)^2$ to clarify this point. So the
$x^2$ factor contributes

$$\frac{A}{x^2} + \frac{B}{x}$$

to the form. On the other hand, the factor $x^2 - 5x + 9$ contributes

$$\frac{Cx+D}{x^2-5x+9}.$$

Altogether, we have

$$\frac{8x^2-19x+18}{x^2(x^2-5x+9)} = \frac{A}{x^2} + \frac{B}{x} + \frac{Cx+D}{x^2-5x+9}.$$

Step 4 —— evaluate constants: now we have to find the values of $A$, $B$, $C$,
and $D$. First we multiply both sides of the above equation by the denominator
$x^2(x^2 - 5x + 9)$ to get

$$8x^2 - 19x + 18 = A(x^2-5x+9) + Bx(x^2-5x+9) + (Cx+D)x^2.$$


Notice that the bits of the denominator that appear in each term of the right- 
hand side are precisely the bits that don’t appear in the original form. For 
example, when you multiply the $B/x$ term by $x^2(x^2 - 5x + 9)$, you knock out
a factor of $x$ to get $Bx(x^2 - 5x + 9)$.

Let’s try substituting a clever value of $x$ in the above equation. The only
value of $x$ that will kill off much of this equation is $x = 0$. If we put $x = 0$,
the above equation becomes

$$18 = A(9),$$

so we immediately know that $A = 2$. We still need to find three more
constants, so we’d better equate coefficients of three different powers of $x$. Let’s
start off by expanding the above equation, then grouping together the different 
powers of x: 

$$\begin{aligned}
8x^2 - 19x + 18 &= Ax^2 - 5Ax + 9A + Bx^3 - 5Bx^2 + 9Bx + Cx^3 + Dx^2 \\
&= (B+C)x^3 + (A - 5B + D)x^2 + (-5A + 9B)x + 9A.
\end{aligned}$$

Now we can equate coefficients of $x^3$, $x^2$ and $x$, one at a time:

$$\begin{array}{ll}
\text{coefficient of } x^3: & 0 = B + C \\
\text{coefficient of } x^2: & 8 = A - 5B + D \\
\text{coefficient of } x^1: & -19 = -5A + 9B.
\end{array}$$

Note that the coefficient of $x^3$ on the left-hand side is 0, since the left-hand
side $8x^2 - 19x + 18$ doesn’t have an $x^3$ term. (By the way, if you equate the
constant coefficients, you get $18 = 9A$, which is the same equation we got
when we substituted $x = 0$ above. Can you see why this happens?)










Anyway, we have some simultaneous equations to solve; starting at the 
last one and working back using the fact that A = 2, it’s pretty easy to see 
that $B = -1$, $D = 1$, and $C = 1$. Substituting into the form that we got at
the end of step 3, we have:

$$\frac{8x^2-19x+18}{x^2(x^2-5x+9)} = \frac{2}{x^2} - \frac1x + \frac{x+1}{x^2-5x+9}.$$

This means that

$$\int \frac{8x^2-19x+18}{x^2(x^2-5x+9)}\,dx = 2\int\frac{1}{x^2}\,dx - \int\frac1x\,dx + \int\frac{x+1}{x^2-5x+9}\,dx.$$



Instead of one nasty integral, we have three simpler integrals. Let’s work them 
all out. 
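
Before integrating, it doesn’t hurt to confirm the constants. The following Python sketch redoes step 4 with exact fractions: it takes $A$ from the substitution $x = 0$, back-substitutes through the coefficient equations, and then checks that the original integrand equals the decomposed form at several values of $x$. The function names and test points are illustrative choices.

```python
from fractions import Fraction

A = Fraction(18, 9)        # from x = 0: 18 = 9A
B = (-19 + 5 * A) / 9      # coefficient of x:   -19 = -5A + 9B
C = -B                     # coefficient of x^3:   0 = B + C
D = 8 - A + 5 * B          # coefficient of x^2:   8 = A - 5B + D

def original(x):
    x = Fraction(x)
    return (8 * x**2 - 19 * x + 18) / (x**2 * (x**2 - 5 * x + 9))

def decomposed(x):
    x = Fraction(x)
    return A / x**2 + B / x + (C * x + D) / (x**2 - 5 * x + 9)

print(A, B, C, D)
print(all(original(x) == decomposed(x) for x in (1, 2, -3, 7)))
```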


Step 5 —— integrate terms with linear powers on the bottom: The first 
two of our integrals are pretty easy: 

$$2\int \frac{1}{x^2}\,dx - \int \frac{1}{x}\,dx = -\frac{2}{x} - \ln|x| + C.$$

So, there’s really not a lot to step 5 in this case. Unfortunately, step 6 is a lot 
more involved.... 











CHAPTER 19 


Techniques of Integration, Part Two 


In this chapter, we’ll finish gathering our techniques of integration by taking 
an extensive look at integrals involving trig functions. Sometimes one has to 
use trig identities to solve these types of problems; on other occasions there 
are no trig functions present, so you have to introduce some by making a trig 
substitution. After we finish all this trigonometry, there’ll be a quick wrap-up 
of the techniques from this and the previous chapter so that you can keep it 
all together. So, this is what we’ll look at in this chapter: 

• integrals involving trig identities; 

• integrals involving powers of trig functions, and reduction formulas; 

• integrals involving trig substitutions; and 

• a summary of all the techniques of integration we’ve seen so far. 

19.1 Integrals Involving Trig Identities

There are three families of trig identities which are particularly useful in
evaluating integrals. The first family arises from the double-angle formula for
$\cos(2x)$. In Section 2.4 of Chapter 2, we saw that $\cos(2x) = 2\cos^2(x) - 1$ and
also that $\cos(2x) = 1 - 2\sin^2(x)$. (Remember, you get one of these from the
other by using $\sin^2(x) + \cos^2(x) = 1$.) For use in integration, it turns out that
the best way to use the formulas is to solve the relevant equation for $\cos^2(x)$
or $\sin^2(x)$. So, we have

$$\boxed{\cos^2(x) = \frac12(1+\cos(2x))} \quad\text{and}\quad \boxed{\sin^2(x) = \frac12(1-\cos(2x)).}$$


It is well worth remembering these identities! In particular, if you ever have 
to take a square root of 1 + cos(anything) or 1 — cos(anything), these identities 
save the day. For example,

$$\int_0^{\pi/2}\sqrt{1-\cos(2x)}\,dx$$

looks pretty nasty, but in fact

$$\int_0^{\pi/2}\sqrt{1-\cos(2x)}\,dx = \int_0^{\pi/2}\sqrt{2\sin^2(x)}\,dx$$

by the second boxed identity above. (We had to multiply the identity by 2
before using it.) Anyway, it’s very tempting to replace $\sqrt{2\sin^2(x)}$ directly
by $\sqrt2\sin(x)$, but let’s do a quick reality check. The square root of $A^2$ isn’t
actually $A$, it’s $|A|$. So the above integral becomes

$$\sqrt2\int_0^{\pi/2}|\sin(x)|\,dx.$$


Luckily, when $x$ is between 0 and $\pi/2$, the values of $\sin(x)$ are always greater
than or equal to zero, so we can drop the absolute value signs after all! We
have reduced things to

$$\sqrt2\int_0^{\pi/2}\sin(x)\,dx;$$

I leave it to you to show that this is just $\sqrt2$.

Sometimes you have to be a little more versatile. Consider 


$$\int_\pi^{2\pi}\sqrt{1+\cos(x)}\,dx.$$

It looks like we want to use the first identity in the box above, but that has
a factor of $1 + \cos(2x)$ on one side and we need $1 + \cos(x)$. No problem: if
you replace $x$ by $x/2$ in the identity, and multiply through by 2, you get

$$2\cos^2\left(\frac x2\right) = 1 + \cos(x).$$

This is exactly what we need! Check ’dis:

$$\int_\pi^{2\pi}\sqrt{1+\cos(x)}\,dx = \int_\pi^{2\pi}\sqrt{2\cos^2\left(\frac x2\right)}\,dx = \sqrt2\int_\pi^{2\pi}\left|\cos\left(\frac x2\right)\right|dx.$$

Now we have to be very careful. When $x$ is between $\pi$ and $2\pi$, we see that $x/2$
is between $\pi/2$ and $\pi$, but $\cos(x)$ is less than or equal to zero on the interval
$[\pi/2, \pi]$ (draw the graph to check this). So the above integral is actually equal
to

$$\sqrt2\int_\pi^{2\pi}\left(-\cos\left(\frac x2\right)\right)dx;$$

I leave it to you to show that this works out to be $2\sqrt2$. By the way, if you
incorrectly replace $|\cos(x/2)|$ by $\cos(x/2)$ instead of $-\cos(x/2)$, you’ll get the
answer $-2\sqrt2$. This cannot be correct: the original integrand $\sqrt{1+\cos(x)}$ is
never negative (and is positive for $x > \pi$), so the integral must be positive too.
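
The absolute-value trap is easy to see numerically. This Python sketch estimates both integrals with Simpson’s rule and also computes what you get if you wrongly drop the absolute value in the second one. The helper `simpson`, the number of subintervals, and the `max(0.0, ...)` guard against tiny negative rounding errors are assumptions made for the sketch.

```python
import math

def simpson(f, a, b, n=2000):
    """Composite Simpson's rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

# Integral of sqrt(1 - cos(2x)) from 0 to pi/2; should be sqrt(2).
first = simpson(lambda x: math.sqrt(max(0.0, 1 - math.cos(2 * x))), 0.0, math.pi / 2)

# Integral of sqrt(1 + cos(x)) from pi to 2*pi; should be 2*sqrt(2).
second = simpson(lambda x: math.sqrt(max(0.0, 1 + math.cos(x))), math.pi, 2 * math.pi)

# Forgetting the absolute value means integrating sqrt(2)*cos(x/2) instead.
wrong = simpson(lambda x: math.sqrt(2) * math.cos(x / 2), math.pi, 2 * math.pi)

print(first, second, wrong)
```

The third number comes out negative, which is the numerical version of the reality check above: a nonnegative integrand can’t have a negative integral.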

Let’s move on to the second family of trig identities. These are the 
Pythagorean identities: 


$$\boxed{\sin^2(x)+\cos^2(x)=1} \qquad \boxed{\tan^2(x)+1=\sec^2(x)} \qquad \boxed{1+\cot^2(x)=\csc^2(x).}$$
These identities are valid for any $x$, as we saw in Section 2.4 of Chapter 2.
Sometimes they are obviously helpful. For example,

$$\int_0^\pi\sqrt{1-\cos^2(x)}\,dx$$

should just be written as

$$\int_0^\pi\sqrt{\sin^2(x)}\,dx = \int_0^\pi|\sin(x)|\,dx.$$

Since $\sin(x) \ge 0$ when $x$ is between 0 and $\pi$, we can drop the absolute values
to get

$$\int_0^\pi\sin(x)\,dx,$$

which is just 2. (Check this!) Compare this example, $\int_0^\pi\sqrt{1-\cos^2(x)}\,dx$,
with the example $\int_0^{\pi/2}\sqrt{1-\cos(2x)}\,dx$ we just did. They may look similar, but
the trig identities we used are different.

Now, sometimes you have to apply a devious trick in order to use the 
above identities. If you see $1 + \operatorname{trig}(x)$ or $1 - \operatorname{trig}(x)$, where “trig” is some trig
function (specifically sine, cosine, secant, or cosecant), in the denominator of
an integral, consider multiplying by the conjugate expression. For example,
to find

$$\int\frac{1}{\sec(x)-1}\,dx,$$

multiply top and bottom by the conjugate expression of the denominator,
which in this case is $\sec(x) + 1$. That is,

$$\int\frac{1}{\sec(x)-1}\,dx = \int\frac{1}{\sec(x)-1}\times\frac{\sec(x)+1}{\sec(x)+1}\,dx.$$

Now you can use the difference of squares formula $(a-b)(a+b) = a^2 - b^2$ on
the denominator to write the integral as

$$\int\frac{\sec(x)+1}{\sec^2(x)-1}\,dx.$$


Aha, the bottom is just $\tan^2(x)$, by one of our trig identities in the boxes
above. Rewriting the integral using this, then splitting it into two integrals,
we find that our integral is

$$\int\frac{\sec(x)+1}{\tan^2(x)}\,dx = \int\frac{\sec(x)}{\tan^2(x)}\,dx + \int\frac{1}{\tan^2(x)}\,dx.$$

The first of these integrals looks a little nasty, but you can save the day by
converting everything to sines and cosines. Specifically,

$$\int\frac{\sec(x)}{\tan^2(x)}\,dx = \int\frac{1/\cos(x)}{\sin^2(x)/\cos^2(x)}\,dx = \int\frac{\cos(x)}{\sin^2(x)}\,dx.$$
sin 2 (x) 
The next step is to substitute $t = \sin(x)$, since $dt = \cos(x)\,dx$ is on the top.
Try this and see what you get. A fancier way is to rewrite $\cos(x)/\sin^2(x)$ as
$\csc(x)\cot(x)$, so

$$\int\frac{\sec(x)}{\tan^2(x)}\,dx = \int\csc(x)\cot(x)\,dx = -\csc(x) + C,$$

since the derivative of $\csc(x)$ is $-\csc(x)\cot(x)$. Now we still have to deal with
the second integral,

$$\int\frac{1}{\tan^2(x)}\,dx.$$

No problem: rewrite this as $\int\cot^2(x)\,dx$, then use another of the trig
identities from the boxes above to express this as

$$\int(\csc^2(x)-1)\,dx = -\cot(x) - x + C.$$

(Did you remember the integral of $\csc^2(x)$? It is a close cousin of the integral
of $\sec^2(x)$, which is $\tan(x) + C$. Just put “co-” in front of everything and
throw in a minus sign to get the $\csc^2(x)$ version!) In any event, we put these
two pieces together to conclude that

$$\int\frac{1}{\sec(x)-1}\,dx = -\csc(x) - \cot(x) - x + C.$$
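
Since this answer came from a chain of identities, here is a short, hedged Python check. It differentiates $-\csc(x) - \cot(x) - x$ numerically and compares it with $\frac{1}{\sec(x) - 1}$ at a few points chosen away from the places where these functions blow up. The helper `derivative`, the step size, and the sample points are illustrative choices.

```python
import math

def derivative(f, x, h=1e-5):
    """Central-difference approximation to f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)

def F(x):
    # -csc(x) - cot(x) - x, written with sines and cosines.
    return -1 / math.sin(x) - math.cos(x) / math.sin(x) - x

def integrand(x):
    return 1 / (1 / math.cos(x) - 1)

# Stay away from multiples of pi and from pi/2, where sec, csc or cot is undefined.
max_error = max(abs(derivative(F, x) - integrand(x))
                for x in (0.5, 1.0, 2.0, 2.5))
print(max_error < 1e-6)
```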



Let’s look at the third family of identities, the so-called products-to-sums 
identities: 


$$\boxed{\begin{aligned}
\cos(A)\cos(B) &= \tfrac12(\cos(A-B)+\cos(A+B)) \\
\sin(A)\sin(B) &= \tfrac12(\cos(A-B)-\cos(A+B)) \\
\sin(A)\cos(B) &= \tfrac12(\sin(A-B)+\sin(A+B)).
\end{aligned}}$$



It’s quite a pain in the butt to remember these. Actually, they all follow 
from the expressions for $\cos(A \pm B)$ and $\sin(A \pm B)$ (which are also in
Section 2.4 of Chapter 2), so if you have those down, you can reverse engineer
the above identities from them. These identities are quite indispensable for 
finding integrals like 

J cos(3a:) sin(19a:) dx. 


Indeed, it looks like we need the third formula above with A = 19x and 
B = 3a;. (Don’t let the order of the cos and sin fool you here! The integral is 
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the same as / sin(19a:) cos(3x) dx.) So we use the identity to get 
J cos(3a:) sin(19x) dx =、 j (sin(19$ — 3a:) + sin(19a: + 3x)) 


cos(16a:) cos(22a :)、 


2 V 16 22 

cos(16a:) cos(22a:) 


+ C 


+ c. 
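
The same recipe works for any product $\sin(ax)\cos(bx)$, as long as $a \ne \pm b$.
Here is a short Python sketch of the general pattern, checked against the
example above by a numerical derivative. The function names and the sample
points are hypothetical choices made for illustration only.

```python
import math


def sin_cos_antiderivative(a, b, x):
    """An antiderivative of sin(a x) cos(b x), from the products-to-sums identity.

    sin(A) cos(B) = (sin(A - B) + sin(A + B)) / 2, so each piece integrates to a
    cosine. Valid only when a != b and a != -b (otherwise we would divide by 0).
    """
    if a == b or a == -b:
        raise ValueError("this formula needs a != b and a != -b")
    return (-math.cos((a - b) * x) / (2 * (a - b))
            - math.cos((a + b) * x) / (2 * (a + b)))


def derivative(f, x, h=1e-6):
    """Central-difference estimate of f'(x)."""
    return (f(x + h) - f(x - h)) / (2.0 * h)


if __name__ == "__main__":
    for x in (0.1, 0.7, 2.3):
        slope = derivative(lambda t: sin_cos_antiderivative(19, 3, t), x)
        print(x, slope, math.sin(19 * x) * math.cos(3 * x))
```

With $a = 19$ and $b = 3$, the two denominators are $2 \times 16 = 32$ and
$2 \times 22 = 44$, matching the answer above.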


19.2 Integrals Involving Powers of Trig Functions 


Now we’ll see how to find certain integrals which have powers of trig functions
in their integrands. For example, how would you find $\int \cos^7(x)\sin^{10}(x)\,dx$ or
$\int \sec^6(x)\,dx$? Unfortunately, these types of integrals require different
techniques, depending on which trig function or functions you’re dealing with.
So, let’s take them one at a time.


19.2.1 Powers of sin and/or cos 


Our example $\int \cos^7(x)\sin^{10}(x)\,dx$ from above fits into this category. Here’s
the golden rule: if one of the powers of $\sin(x)$ or $\cos(x)$ is odd, then grab it
and don’t let it get away; it is your friend! (If they are both odd, then take
the one with the lowest power as your friend.) If you’ve grabbed your odd
power, then you need to pull out one power to go with the $dx$; then deal with
what’s left (which is now an even power) by using one of the identities

$$\cos^2(x) = 1 - \sin^2(x) \quad\text{or}\quad \sin^2(x) = 1 - \cos^2(x).$$

Note that these are just rearrangements of the identity $\sin^2(x) + \cos^2(x) = 1$.
Anyway, the best way to see how the technique of pulling out one power
from the odd power works is by looking at an example. In particular, to find
$\int \cos^7(x)\sin^{10}(x)\,dx$, note that 7 is odd, so we grab $\cos^7(x)$ and pull out just
one $\cos(x)$ to go with the $dx$. We get

$$\int \cos^7(x)\sin^{10}(x)\,dx = \int \cos^6(x)\sin^{10}(x)\cos(x)\,dx.$$


So what? Well, we need to deal with the $\cos^6(x)$ which is left over. Now 6 is
even, so we can write $\cos^6(x) = (\cos^2(x))^3 = (1 - \sin^2(x))^3$, and the integral
becomes

$$\int \left(1 - \sin^2(x)\right)^3 \sin^{10}(x)\cos(x)\,dx.$$

Now if we put $t = \sin(x)$, then $dt = \cos(x)\,dx$, so it’s easy to get this integral
over to $t$-land; it’s just

$$\int (1 - t^2)^3 t^{10}\,dt = \int (1 - 3t^2 + 3t^4 - t^6)\,t^{10}\,dt = \int (t^{10} - 3t^{12} + 3t^{14} - t^{16})\,dt,$$

which works out to be

$$\frac{t^{11}}{11} - \frac{3t^{13}}{13} + \frac{t^{15}}{5} - \frac{t^{17}}{17} + C.$$

Converting back to $x$-land, we get our answer:

$$\int \cos^7(x)\sin^{10}(x)\,dx = \frac{\sin^{11}(x)}{11} - \frac{3\sin^{13}(x)}{13} + \frac{\sin^{15}(x)}{5} - \frac{\sin^{17}(x)}{17} + C.$$

You see how stealing one power of $\cos(x)$ allowed us to change the rest of the
integrand so that it only involved $\sin(x)$, leaving the $\cos(x)$ to take care of
the $dx$ via the substitution $t = \sin(x)$.
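
The expand-and-integrate step is mechanical enough to hand to a computer. The
sketch below, in Python, handles $\int \cos^{2k+1}(x)\sin^m(x)\,dx$ by expanding
$(1 - t^2)^k$ with the binomial theorem and integrating term by term. The names
`odd_cos_terms` and `evaluate` are made up for this illustration, and the code
only covers the case where the odd power sits on the cosine.

```python
from fractions import Fraction
from math import comb, cos, sin


def odd_cos_terms(k, m):
    """Terms (power, coefficient) of an antiderivative of cos(x)**(2k+1) * sin(x)**m.

    Pull out one cos(x), write cos(x)**(2k) as (1 - t**2)**k with t = sin(x),
    expand with the binomial theorem, and integrate t**(m + 2j) term by term.
    """
    return [
        (m + 2 * j + 1, Fraction((-1) ** j * comb(k, j), m + 2 * j + 1))
        for j in range(k + 1)
    ]


def evaluate(terms, x):
    """Evaluate the sum of coefficient * sin(x)**power (taking C = 0)."""
    t = sin(x)
    return sum(float(c) * t ** p for p, c in terms)


if __name__ == "__main__":
    # cos^7 sin^10 corresponds to k = 3 (since 7 = 2*3 + 1) and m = 10.
    for power, coeff in odd_cos_terms(3, 10):
        print(f"({coeff}) * sin(x)^{power}")
```

For $k = 3$ and $m = 10$ the coefficients come out as $1/11$, $-3/13$, $1/5$, and
$-1/17$, which are exactly the ones in the answer above.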

Now, what if neither power is odd? Well, if both powers are even (for
example, if you had to work out $\int \cos^2(x)\sin^4(x)\,dx$), you should use the
double-angle formulas. We just saw them in the previous section, but here
they are again for reference:

$$\cos^2(x) = \tfrac{1}{2}\left(1 + \cos(2x)\right) \qquad\qquad \sin^2(x) = \tfrac{1}{2}\left(1 - \cos(2x)\right)$$

Now you can just replace everything in sight, and you’ll get a whole bunch of
simpler integrals which are various powers of cosines. You then need to find
them using the same techniques as we have just used, depending on whether
the power in each integral is even or odd. In our example, we need to think
of $\sin^4(x)$ as $(\sin^2(x))^2$, so we get

$$\int \cos^2(x)\sin^4(x)\,dx = \int \tfrac{1}{2}\left(1 + \cos(2x)\right)\left(\tfrac{1}{2}\left(1 - \cos(2x)\right)\right)^2 dx.$$


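Now we expand and multiply to get

$$\frac{1}{8}\int \left(1 - \cos(2x) - \cos^2(2x) + \cos^3(2x)\right)dx.$$

We need to break this up into four integrals. Let’s not worry about the $\tfrac{1}{8}$ out
front or the minus signs for the moment; we’ll take care of them later. The first
two integrals are easy, since $\int 1\,dx = x + C$ and $\int \cos(2x)\,dx = \tfrac{1}{2}\sin(2x) + C$.

For the third integral, $\int \cos^2(2x)\,dx$, the power is even, so we use the
double-angle formulas again, but with $x$ replaced by $2x$:

$$\int \cos^2(2x)\,dx = \frac{1}{2}\int \left(1 + \cos(4x)\right)dx = \frac{x}{2} + \frac{\sin(4x)}{8} + C.$$

How about $\int \cos^3(2x)\,dx$? Well, now we have an odd power (namely 3), so we
grab it! Let’s write the integral as $\int \cos^2(2x)\cos(2x)\,dx$ and replace $\cos^2(2x)$
by $(1 - \sin^2(2x))$. Substituting $t = \sin(2x)$, we have $dt = 2\cos(2x)\,dx$, so the
integral $\int \cos^3(2x)\,dx$ is

$$\int \left(1 - \sin^2(2x)\right)\cos(2x)\,dx = \frac{1}{2}\int (1 - t^2)\,dt = \frac{1}{2}\left(t - \frac{t^3}{3}\right) + C = \frac{\sin(2x)}{2} - \frac{\sin^3(2x)}{6} + C.$$

(Pause to catch breath.) Now we put it all together and simplify a little; you
should check that we get

$$\int \cos^2(x)\sin^4(x)\,dx = \frac{1}{8}\left(x - \frac{\sin(2x)}{2} - \frac{x}{2} - \frac{\sin(4x)}{8} + \frac{\sin(2x)}{2} - \frac{\sin^3(2x)}{6}\right) + C$$
$$= \frac{x}{16} - \frac{\sin(4x)}{64} - \frac{\sin^3(2x)}{48} + C.$$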

Make sure you can fill in all the details. 
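
One way to check that the details have been filled in correctly is to compare
the closed-form answer with a direct numerical integration. The Python sketch
below uses a hand-rolled composite Simpson's rule. The choice of rule, the
number of subintervals, and the function names are all my own illustrative
assumptions.

```python
import math


def simpson(f, a, b, n=1000):
    """Composite Simpson's rule on [a, b] with n subintervals (n must be even)."""
    if n % 2:
        raise ValueError("n must be even")
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3


def integrand(x):
    """cos^2(x) sin^4(x): both powers are even."""
    return math.cos(x) ** 2 * math.sin(x) ** 4


def antiderivative(x):
    """x/16 - sin(4x)/64 - sin^3(2x)/48, the answer found via double angles."""
    return x / 16 - math.sin(4 * x) / 64 - math.sin(2 * x) ** 3 / 48


if __name__ == "__main__":
    a, b = 0.0, 1.3
    print(simpson(integrand, a, b), antiderivative(b) - antiderivative(a))
```

The two printed numbers should agree to many decimal places. On the interval
$[0, \pi/2]$ both sine terms vanish, so the exact value there is just $\pi/32$.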


19.2.2 Powers of tan 

Consider $\int \tan^n(x)\,dx$, where $n$ is some integer. Let’s look at the first couple
of cases. For $n = 1$, we need to know how to do $\int \tan(x)\,dx$. This is a pretty
standard integral, which you can solve by setting $t = \cos(x)$, noting that
$dt = -\sin(x)\,dx$:

$$\int \tan(x)\,dx = \int \frac{\sin(x)}{\cos(x)}\,dx = -\int \frac{dt}{t} = -\ln|t| + C = -\ln|\cos(x)| + C.$$

The answer can also be written as $\ln|\sec(x)| + C$. (Why?)

How about $n = 2$? For this case, and indeed other cases, it’s essential to
use the Pythagorean identity

$$\tan^2(x) = \sec^2(x) - 1$$

which we looked at in the previous section. So we have

$$\int \tan^2(x)\,dx = \int \left(\sec^2(x) - 1\right)dx = \tan(x) - x + C.$$


To do higher powers ($n \ge 3$), you have to extract $\tan^2(x)$ and change
it into $(\sec^2(x) - 1)$. This gives you two integrals. The first can be done
by substituting $t = \tan(x)$ and using $dt = \sec^2(x)\,dx$. The second is a lower
power of $\tan(x)$ and you can just repeat the method. For example, how would
you find $\int \tan^6(x)\,dx$? Let’s see:

$$\int \tan^6(x)\,dx = \int \tan^4(x)\tan^2(x)\,dx = \int \tan^4(x)\left(\sec^2(x) - 1\right)dx$$
$$= \int \tan^4(x)\sec^2(x)\,dx - \int \tan^4(x)\,dx.$$

So now we have to work out two integrals. To do the first one, set $t = \tan(x)$;
as we just said, $dt = \sec^2(x)\,dx$. This gives

$$\int \tan^4(x)\sec^2(x)\,dx = \int t^4\,dt = \frac{t^5}{5} + C = \frac{\tan^5(x)}{5} + C.$$











Now, the second integral is $\int \tan^4(x)\,dx$, so we have to repeat the whole
process. Take out a factor of $\tan^2(x)$ and change it to $(\sec^2(x) - 1)$:

$$\int \tan^4(x)\,dx = \int \tan^2(x)\tan^2(x)\,dx = \int \tan^2(x)\left(\sec^2(x) - 1\right)dx$$
$$= \int \tan^2(x)\sec^2(x)\,dx - \int \tan^2(x)\,dx.$$

Once again, we have two integrals. To do the first, let $t = \tan(x)$, so that
$dt = \sec^2(x)\,dx$ (sound familiar?). So

$$\int \tan^2(x)\sec^2(x)\,dx = \int t^2\,dt = \frac{t^3}{3} + C = \frac{\tan^3(x)}{3} + C.$$

Meanwhile, we saw above that

$$\int \tan^2(x)\,dx = \int \left(\sec^2(x) - 1\right)dx = \tan(x) - x + C.$$

Putting it all together (being careful not to forget the minus signs), we see
that

$$\int \tan^6(x)\,dx = \frac{\tan^5(x)}{5} - \frac{\tan^3(x)}{3} + \tan(x) - x + C.$$


What a pain. Still, it could be worse: 


19.2.3 Powers of sec 



Yup, this one really sucks, except for $\int \sec^2(x)\,dx$, which is easy. Let’s start
with the first power, $\int \sec(x)\,dx$. There are many ways of finding this
integral. The easiest involves a cool trick that is well worth remembering, as
it’s a real timesaver. Unfortunately it’s the sort of trick that is completely
counterintuitive, and it boggles the mind that anyone even thought of it in
the first place. The idea is to multiply top and bottom by the bizarre quantity
$(\sec(x) + \tan(x))$. Watch and be amazed:

$$\int \sec(x)\,dx = \int \sec(x) \times \frac{\sec(x) + \tan(x)}{\sec(x) + \tan(x)}\,dx = \int \frac{\sec^2(x) + \sec(x)\tan(x)}{\sec(x) + \tan(x)}\,dx$$
$$= \ln|\sec(x) + \tan(x)| + C,$$

since the derivative of the denominator $\sec(x) + \tan(x)$ is miraculously equal
to the numerator.

How about the second power of $\sec(x)$? Not much to this one:

$$\int \sec^2(x)\,dx = \tan(x) + C.$$


That was easy. Unfortunately, it gets pretty messy for larger powers. The
standard idea is to pull out $\sec^2(x)$ (which is similar to what we did with
powers of $\tan(x)$) and integrate by parts, using $dv = \sec^2(x)\,dx$ and $u$ as the
rest of the powers of $\sec(x)$. This means that $v = \tan(x)$ (remember, we don’t
need a constant here). When you do the integration by parts, you will of
course get a new integral; the integrand should be a lower power of $\sec(x)$
multiplied by $\tan^2(x)$. Once again, we have to use $\tan^2(x) = \sec^2(x) - 1$ and
get two integrals. One of them is a multiple of the original integral! You
have to put this back on the left-hand side. The other one is a lower power
of $\sec(x)$, and you have to repeat the whole process until you get down to
$\int \sec(x)\,dx$ or $\int \sec^2(x)\,dx$, both of which we now know how to do.

That was quite a technical explanation. Let’s see a formidable example:
find $\int \sec^6(x)\,dx$. Start off by breaking out $\sec^2(x)$, like this:

$$\int \sec^6(x)\,dx = \int \sec^4(x)\sec^2(x)\,dx.$$

Now integrate by parts with $u = \sec^4(x)$ and $dv = \sec^2(x)\,dx$. By
differentiating $u$ and integrating $dv$ as usual, we find that

$$du = 4\sec^3(x)\sec(x)\tan(x)\,dx = 4\sec^4(x)\tan(x)\,dx \quad\text{and}\quad v = \tan(x).$$

So now we can integrate by parts to get

$$\int \sec^4(x)\sec^2(x)\,dx = \sec^4(x)\tan(x) - \int \tan(x) \cdot 4\sec^4(x)\tan(x)\,dx.$$

Let’s look at the integral on the right-hand side. We can write this as

$$4\int \sec^4(x)\tan^2(x)\,dx = 4\int \sec^4(x)\left(\sec^2(x) - 1\right)dx = 4\left(\int \sec^6(x)\,dx - \int \sec^4(x)\,dx\right).$$

Putting it all together, we have

$$\int \sec^6(x)\,dx = \sec^4(x)\tan(x) - 4\int \sec^6(x)\,dx + 4\int \sec^4(x)\,dx.$$


Now comes the sexy part: transfer the first integral on the right-hand side
over to the left-hand side to get

$$5\int \sec^6(x)\,dx = \sec^4(x)\tan(x) + 4\int \sec^4(x)\,dx.$$

We can divide this equation by 5 to get

$$\int \sec^6(x)\,dx = \frac{1}{5}\sec^4(x)\tan(x) + \frac{4}{5}\int \sec^4(x)\,dx.$$

Are we done? No, we still need to know how to do $\int \sec^4(x)\,dx$! We just have
to repeat the whole darn process again. Here’s where it’s your turn to repeat
all the above steps. If you don’t screw up, you should get

$$\int \sec^4(x)\,dx = \frac{1}{3}\sec^2(x)\tan(x) + \frac{2}{3}\int \sec^2(x)\,dx.$$

Now we need $\int \sec^2(x)\,dx$, but we’ve finally knocked this down to something
we can do: it’s just $\tan(x) + C$, as we’ve already seen. Putting it all together,
we have

$$\int \sec^6(x)\,dx = \frac{1}{5}\sec^4(x)\tan(x) + \frac{4}{5}\left(\frac{1}{3}\sec^2(x)\tan(x) + \frac{2}{3}\tan(x)\right) + C$$
$$= \frac{1}{5}\sec^4(x)\tan(x) + \frac{4}{15}\sec^2(x)\tan(x) + \frac{8}{15}\tan(x) + C.$$
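
With this many fractions flying around, it is worth checking the final answer.
Here is a small Python sketch that differentiates the closed form numerically
and compares it with $\sec^6(x)$. The helper names, the step size, and the sample
points (all inside $(-\pi/2, \pi/2)$) are arbitrary choices made for
illustration.

```python
import math


def sec(x):
    """Secant, defined as 1 / cos(x)."""
    return 1.0 / math.cos(x)


def sec6_antiderivative(x):
    """(1/5) sec^4 tan + (4/15) sec^2 tan + (8/15) tan, taking C = 0."""
    s, t = sec(x), math.tan(x)
    return s ** 4 * t / 5 + 4 * s ** 2 * t / 15 + 8 * t / 15


def derivative(f, x, h=1e-6):
    """Central-difference estimate of f'(x)."""
    return (f(x + h) - f(x - h)) / (2.0 * h)


if __name__ == "__main__":
    for x in (-1.0, 0.3, 0.8):
        print(x, derivative(sec6_antiderivative, x), sec(x) ** 6)
```

If the two columns of output match, the coefficients $\tfrac{1}{5}$, $\tfrac{4}{15}$,
and $\tfrac{8}{15}$ have survived the bookkeeping.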


Man, I’m exhausted just writing about this. Look, the idea with powers of
both $\tan(x)$ and $\sec(x)$ is to knock the power down by 2 and then repeat;
keep going until you get down to either the first or second power, which you
can just do directly. By the way, how would you do

$$\int \frac{dx}{\cos^6(x)}?$$

That’s right, you write it as $\int \sec^6(x)\,dx$, of course (which we just worked
out!). How about

$$\int \frac{\sin^2(x)}{\cos^3(x)}\,dx?$$

Write the numerator as $1 - \cos^2(x)$ and break up the integral:

$$\int \frac{\sin^2(x)}{\cos^3(x)}\,dx = \int \frac{1 - \cos^2(x)}{\cos^3(x)}\,dx = \int \sec^3(x)\,dx - \int \sec(x)\,dx.$$

Now use the techniques above to find these two integrals involving powers of
$\sec(x)$.


19.2.4 Powers of cot

These work just like powers of $\tan(x)$. You pull out $\cot^2(x)$ and use the
Pythagorean identity

$$\cot^2(x) = \csc^2(x) - 1.$$

Just beware that when you set $t = \cot(x)$, you have $dt = -\csc^2(x)\,dx$. That
is, don’t forget the minus sign! Now try doing a few for practice. For example,
try $\int \cot^6(x)\,dx$ and compare your answer with the solution to $\int \tan^6(x)\,dx$
in Section 19.2.2 above. You will see that they are very similar indeed.


19.2.5 Powers of csc

These work just like powers of $\sec(x)$. You pull out $\csc^2(x)$ and integrate by
parts, using $dv = \csc^2(x)\,dx$. Beware: you now have $v = -\cot(x)$, and $du$
also involves a minus sign which you have to worry about. Again, try some
examples. If you work out $\int \csc^6(x)\,dx$ and compare your solution to the
worked example $\int \sec^6(x)\,dx$ from Section 19.2.3 above, you should see more
than a passing resemblance.


19.2.6 Reduction formulas

The methods of the last four sections all involve knocking the power of the
trig function you’re dealing with down by 2, then repeating the process. For
example, in Section 19.2.2, we saw that we can integrate a power of $\tan(x)$
by extracting $\tan^2(x)$ and replacing it by $\sec^2(x) - 1$. Let’s try to write out
the method in general. First, we’re dealing with $\int \tan^n(x)\,dx$, so we’ll give it
a name: $I_n$ (for integral number $n$). That is,

$$I_n = \int \tan^n(x)\,dx.$$

We already know that

$$I_0 = \int \tan^0(x)\,dx = \int 1\,dx = x + C \quad\text{and}\quad I_1 = \int \tan(x)\,dx = -\ln|\cos(x)| + C.$$


Now, when $n \ge 2$, we can steal $\tan^2(x)$ away from $\tan^n(x)$, leaving behind
$\tan^{n-2}(x)$; then we can use our trig identity and split up the integral to get

$$I_n = \int \tan^n(x)\,dx = \int \tan^{n-2}(x)\tan^2(x)\,dx = \int \tan^{n-2}(x)\left(\sec^2(x) - 1\right)dx$$
$$= \int \tan^{n-2}(x)\sec^2(x)\,dx - \int \tan^{n-2}(x)\,dx.$$

The second integral in this last expression, $\int \tan^{n-2}(x)\,dx$, is just $I_{n-2}$. As for
the first, if you put $t = \tan(x)$ so that $dt = \sec^2(x)\,dx$, you’ll see it becomes
$\int t^{n-2}\,dt$, which is just $t^{n-1}/(n-1) + C$. Replacing $t$ by $\tan(x)$, we have
shown that

$$I_n = \frac{1}{n-1}\tan^{n-1}(x) - I_{n-2}.$$

There’s no need for a constant, since both $I_n$ and $I_{n-2}$ are indefinite integrals.
The above equation is called a reduction formula, since it helps us reduce the
number $n$ to a smaller number $n - 2$.

Let’s see how to use the formula to find $\int \tan^6(x)\,dx$. This is just $I_6$. So,
put $n = 6$ in the reduction formula to get

$$I_6 = \frac{1}{5}\tan^5(x) - I_4.$$

OK, so we need $I_4$. Let’s write out the reduction formula again, this time
with $n = 4$:

$$I_4 = \frac{1}{3}\tan^3(x) - I_2.$$

Once again, but with $n = 2$:

$$I_2 = \frac{1}{1}\tan^1(x) - I_0 = \tan(x) - x + C,$$

where we have used the above formula for $I_0$. So we now know $I_2$, and we
can work backward to get $I_4$:

$$I_4 = \frac{1}{3}\tan^3(x) - I_2 = \frac{1}{3}\tan^3(x) - \tan(x) + x + C.$$

Finally, we can find our desired integral, which is none other than $I_6$:

$$\int \tan^6(x)\,dx = I_6 = \frac{1}{5}\tan^5(x) - I_4 = \frac{1}{5}\tan^5(x) - \frac{1}{3}\tan^3(x) + \tan(x) - x + C.$$

This agrees with our answer from Section 19.2.2. Now try to repeat this for
powers of secant, cosecant, and cotangent; the methods are given above, and
all you have to do is rewrite them as reduction formulas.
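
A reduction formula is really just a recursion, so it translates almost word
for word into code. Here is a minimal Python sketch for powers of tangent. The
function name is hypothetical, the constant of integration is taken to be 0,
and the code assumes $n$ is a nonnegative integer.

```python
import math


def tan_power_antiderivative(n, x):
    """One antiderivative of tan(x)**n, via I_n = tan^(n-1)/(n-1) - I_(n-2)."""
    if n < 0:
        raise ValueError("n must be a nonnegative integer")
    if n == 0:
        return x                               # I_0 = x
    if n == 1:
        return -math.log(abs(math.cos(x)))     # I_1 = -ln|cos(x)|
    return math.tan(x) ** (n - 1) / (n - 1) - tan_power_antiderivative(n - 2, x)


if __name__ == "__main__":
    x = 0.7
    t = math.tan(x)
    # Compare the recursion for n = 6 with the closed form worked out by hand.
    print(tan_power_antiderivative(6, x), t ** 5 / 5 - t ** 3 / 3 + t - x)
```

Even powers bottom out at $I_0$ and odd powers at $I_1$, which is why both base
cases are needed.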

The method also works for definite integrals. For example, how would
you find the definite integral $\int_0^{\pi/2} \cos^8(x)\,dx$? You could use the double-angle
formulas, as described in Section 19.2.1 above, but that would be a pain in
the ass. (Try it if you don’t believe me!) Instead, let’s set

$$I_n = \int_0^{\pi/2} \cos^n(x)\,dx$$

and make a mental note that we eventually want to find $I_8$. The trick now is
to pull out one factor of $\cos(x)$, like this:

$$I_n = \int_0^{\pi/2} \cos^n(x)\,dx = \int_0^{\pi/2} \cos^{n-1}(x)\cos(x)\,dx.$$

Now integrate by parts with $u = \cos^{n-1}(x)$ and $dv = \cos(x)\,dx$. This means,
of course, that $v = \sin(x)$. (See Section 18.2 in the previous chapter for more
about integration by parts.) I leave it to you to show that we get

$$I_n = \cos^{n-1}(x)\sin(x)\Big|_0^{\pi/2} + \int_0^{\pi/2} (n-1)\cos^{n-2}(x)\sin^2(x)\,dx.$$


If $n \ge 2$, then the first expression on the right-hand side is 0, since we have
$\cos(\pi/2) = 0$ and $\sin(0) = 0$. On the other hand, we can replace $\sin^2(x)$ by
$1 - \cos^2(x)$ in the integral to see that

$$I_n = \int_0^{\pi/2} (n-1)\cos^{n-2}(x)\left(1 - \cos^2(x)\right)dx$$
$$= (n-1)\int_0^{\pi/2} \cos^{n-2}(x)\,dx - (n-1)\int_0^{\pi/2} \cos^n(x)\,dx.$$

Now what? Well, notice that the last two integrals are just $I_{n-2}$ and $I_n$,
respectively. So

$$I_n = (n-1)I_{n-2} - (n-1)I_n.$$

Solving for $I_n$ by adding $(n-1)I_n$ to both sides and dividing by $n$, we arrive
at the following reduction formula:

$$I_n = \frac{n-1}{n}\,I_{n-2}.$$

That should make life a lot easier! In particular, we are looking for $I_8$, so by
using the above formula over and over again, with $n = 8$, then $n = 6$, then
$n = 4$, and finally $n = 2$, we get

$$I_8 = \frac{7}{8}I_6 = \frac{7}{8}\cdot\frac{5}{6}I_4 = \frac{7}{8}\cdot\frac{5}{6}\cdot\frac{3}{4}I_2 = \frac{7}{8}\cdot\frac{5}{6}\cdot\frac{3}{4}\cdot\frac{1}{2}I_0.$$

Now we need to find $I_0$. Since $\cos^0(x)$ is just 1, we have $I_0 = \int_0^{\pi/2} 1\,dx = \pi/2$.
Simplifying the above fraction, we have shown that

$$\int_0^{\pi/2} \cos^8(x)\,dx = \frac{7}{8}\cdot\frac{5}{6}\cdot\frac{3}{4}\cdot\frac{1}{2}\cdot\frac{\pi}{2} = \frac{35\pi}{256}.$$

As a bonus, we can easily find $\int_0^{\pi/2} \cos^n(x)\,dx$ for any other positive integer
$n$. (You’ll need to note that $I_1 = \int_0^{\pi/2} \cos(x)\,dx = 1$ in order to get the odd
powers.)
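
Since each step just multiplies by $(n-1)/n$, the whole chain is a short loop.
The Python sketch below keeps the arithmetic exact using fractions. The function
name and the convention of returning a pair (rational coefficient, whether to
multiply by $\pi$) are my own illustrative choices.

```python
from fractions import Fraction
import math


def cos_power_integral(n):
    """Integral of cos(x)**n over [0, pi/2], as (coefficient, includes_pi).

    The value is coefficient * pi when includes_pi is True, and just
    coefficient otherwise. Uses I_n = ((n - 1) / n) * I_(n-2) repeatedly.
    """
    if n < 0:
        raise ValueError("n must be a nonnegative integer")
    coeff = Fraction(1)
    while n >= 2:
        coeff *= Fraction(n - 1, n)
        n -= 2
    if n == 0:
        return coeff * Fraction(1, 2), True    # chain ends at I_0 = pi/2
    return coeff, False                        # chain ends at I_1 = 1


if __name__ == "__main__":
    coeff, includes_pi = cos_power_integral(8)
    value = float(coeff) * (math.pi if includes_pi else 1.0)
    print(coeff, includes_pi, value)
```

For $n = 8$ this gives the coefficient $35/256$ with a factor of $\pi$, in
agreement with the calculation above; for odd $n$ there is no $\pi$ at all.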

By the way, reduction formulas don’t have to involve trig functions. For
example, if

$$I_n = \int x^n e^x\,dx,$$

then you can integrate by parts with $u = x^n$ and $dv = e^x\,dx$ (so $v = e^x$) to
get

$$I_n = x^n e^x - \int n x^{n-1} e^x\,dx.$$

This gives the reduction formula $I_n = x^n e^x - nI_{n-1}$. Incidentally, unlike the
situation with all the trig function examples, this time $I_n$ is expressed in terms
of $I_{n-1}$, not $I_{n-2}$. So you only need to know $I_0$ at the end of the chain, which
isn’t hard to find: $I_0 = \int e^x\,dx = e^x + C$.
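
Every $I_n$ in this chain has the form $e^x P_n(x)$ for some polynomial $P_n$, and
the reduction formula says $P_n(x) = x^n - nP_{n-1}(x)$ with $P_0(x) = 1$. Here is a
Python sketch that builds the coefficients of $P_n$. Writing the answer as
$e^x P_n(x)$ and storing coefficients in a list (lowest degree first) are
conventions I am assuming for the illustration.

```python
import math


def xn_exp_polynomial(n):
    """Coefficients [c_0, ..., c_n] of P_n, where the integral of x**n * e**x
    equals e**x * P_n(x) + C. Built from P_k = x**k - k * P_(k-1), P_0 = 1."""
    coeffs = [1]                                  # P_0(x) = 1
    for k in range(1, n + 1):
        coeffs = [-k * c for c in coeffs] + [1]   # multiply by -k, append x**k
    return coeffs


def antiderivative(n, x):
    """Evaluate e**x * P_n(x), taking C = 0."""
    return math.exp(x) * sum(c * x ** i for i, c in enumerate(xn_exp_polynomial(n)))


if __name__ == "__main__":
    # For n = 3 the list [-6, 6, -3, 1] means P_3(x) = x^3 - 3x^2 + 6x - 6.
    print(xn_exp_polynomial(3))
```

So, for instance, $\int x^3 e^x\,dx = e^x(x^3 - 3x^2 + 6x - 6) + C$, which you can
confirm by differentiating.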


19.3 Integrals Involving Trig Substitutions

Now let’s look at how to do integrals involving an odd power of the square
root of a quadratic. Here are some examples of the type of integral we’re
considering:

$$\int \frac{dx}{x^3\sqrt{x^2 - 4}}, \qquad \int \frac{x^2}{(9 - x^2)^{3/2}}\,dx, \qquad\text{or}\qquad \int (x^2 + 15)^{-5/2}\,dx.$$

The basic idea is that there are three types, corresponding to whether you
have to worry about $a^2 - x^2$, $x^2 + a^2$, or $x^2 - a^2$. Here $a$ is just some number.
For example, the first integral above involves $x^2 - a^2$ with $a = 2$, the second
involves $a^2 - x^2$ with $a = 3$, and the third involves $x^2 + a^2$ with $a = \sqrt{15}$. Each
of these three types requires a different substitution. Most of the time, after
substituting, you end up with an integral involving powers of trig functions,
which is where the previous section comes in. Let’s look at the three types of
integrals one at a time; then we’ll summarize the whole situation at the end.


19.3.1 Type 1: $\sqrt{a^2 - x^2}$

If you have an integral involving an odd power of $\sqrt{a^2 - x^2}$, the correct
substitution to use is $x = a\sin(\theta)$. (You could use $x = a\cos(\theta)$ if you prefer, but
there would be no advantage to it, so stick with sine.) The reason that this
substitution is effective is that

$$a^2 - x^2 = a^2 - a^2\sin^2(\theta) = a^2\left(1 - \sin^2(\theta)\right) = a^2\cos^2(\theta),$$

and now you can easily take a square root. Remember that if you are changing
variables from $x$ to $\theta$, you have to go from $x$-land to $\theta$-land. That is, everything
about the integral has to be in terms of $\theta$, not $x$. In particular, we’ll need
to replace $dx$ by something in $\theta$ and $d\theta$. No problem: just differentiate the
equation $x = a\sin(\theta)$ to get $dx = a\cos(\theta)\,d\theta$. (This sort of substitution, where
the equation is solved for $x$ instead of the substituting variable, was discussed
at the ends of Sections 18.1.2 and 18.1.3 of the previous chapter.) Anyway,
now we can hopefully do the integral in $\theta$-land, but in the end we have to
change the answer back to $x$-land. To do this, it will be useful to draw the
following right-angled triangle with one angle equal to $\theta$:



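Now we know $\sin(\theta) = x/a$, so we can fill in two of the sides as shown (the
side opposite $\theta$ is $x$, and the hypotenuse is $a$):



Finally, we can use Pythagoras’ Theorem to see that the third side is $\sqrt{a^2 - x^2}$,
so we complete the triangle as follows:

[Figure: right-angled triangle with angle $\theta$, hypotenuse $a$, opposite side $x$, and adjacent side $\sqrt{a^2 - x^2}$.]

Now we can easily read off from this triangle the values of $\cos(\theta)$, $\tan(\theta)$, or
any other trig function of $\theta$, and get back to $x$-land without too much trouble.
Let’s see how it works in practice. We’ll use an example from above: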

$$\int \frac{x^2}{(9 - x^2)^{3/2}}\,dx.$$

We make the substitution $x = 3\sin(\theta)$, so $dx = 3\cos(\theta)\,d\theta$. Also, we see that
$9 - x^2 = 9 - 9\sin^2(\theta) = 9\cos^2(\theta)$. So the integral becomes

$$\int \frac{(3\sin(\theta))^2}{(9\cos^2(\theta))^{3/2}}\,3\cos(\theta)\,d\theta = \frac{3^2 \times 3}{27}\int \frac{\sin^2(\theta)}{\cos^3(\theta)}\cos(\theta)\,d\theta = \int \tan^2(\theta)\,d\theta,$$

since $9^{3/2} = 27$. Now we use the techniques from Section 19.2.2 above to see
that

$$\int \tan^2(\theta)\,d\theta = \int \left(\sec^2(\theta) - 1\right)d\theta = \tan(\theta) - \theta + C.$$

We just have to get back to $x$-land. Since $\sin(\theta) = x/3$, the relevant triangle
looks like this:

[Figure: right-angled triangle with angle $\theta$, hypotenuse 3, opposite side $x$, and adjacent side $\sqrt{9 - x^2}$.]

We can read off from the triangle that $\tan(\theta) = x/\sqrt{9 - x^2}$. Also, since
$\sin(\theta) = x/3$, we have $\theta = \sin^{-1}(x/3)$. Substituting into the answer above,
we see that

$$\int \frac{x^2}{(9 - x^2)^{3/2}}\,dx = \frac{x}{\sqrt{9 - x^2}} - \sin^{-1}\left(\frac{x}{3}\right) + C.$$

If you didn’t use the triangle, you might be tempted to write $\tan(\theta)$ as the
messy expression

$$\tan\left(\sin^{-1}\left(\frac{x}{3}\right)\right),$$

but I hope you agree that our actual answer above is preferable.
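
To see that the tidy form and the messy form really are the same thing, and that
the final answer differentiates back to the integrand, you can try a few
numbers. Here is a minimal Python sketch; the function names and sample points
(all inside $(-3, 3)$) are illustrative assumptions of mine.

```python
import math


def integrand(x):
    """x^2 / (9 - x^2)^(3/2), defined for -3 < x < 3."""
    return x ** 2 / (9 - x ** 2) ** 1.5


def antiderivative(x):
    """x / sqrt(9 - x^2) - arcsin(x / 3), taking C = 0."""
    return x / math.sqrt(9 - x ** 2) - math.asin(x / 3)


def messy_tan(x):
    """tan(arcsin(x / 3)): the form you get without drawing the triangle."""
    return math.tan(math.asin(x / 3))


def derivative(f, x, h=1e-6):
    """Central-difference estimate of f'(x)."""
    return (f(x + h) - f(x - h)) / (2.0 * h)


if __name__ == "__main__":
    for x in (-2.0, 0.5, 2.5):
        print(x, messy_tan(x), x / math.sqrt(9 - x ** 2),
              derivative(antiderivative, x), integrand(x))
```

In each row, the second and third numbers should agree (that is the triangle at
work), and so should the fourth and fifth.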

Before we go on to Type 2, do you see that we’ve been a little careless
here? We had to work out $(9\cos^2(\theta))^{3/2}$ and just claimed that it is $27\cos^3(\theta)$.
Certainly $9^{3/2} = 27$, but is it always true that $(\cos^2(\theta))^{3/2} = \cos^3(\theta)$?
Actually, this is only true if $\cos(\theta) \ge 0$. The problem is that raising a quantity to
the power 3/2 actually involves taking a positive square root. Indeed, for any
positive number $A$, we have $A^{3/2} = (A^{1/2})^3 = (\sqrt{A})^3$. So we should really
have written

$$(\cos^2(\theta))^{3/2} = \left(\sqrt{\cos^2(\theta)}\right)^3 = |\cos^3(\theta)|.$$

Luckily, the absolute value signs turn out to be unnecessary for Type 1 and
also for Type 2 below (but not for Type 3), so we were right all along. This
point will be discussed in gory detail in Section 19.3.6 below.


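19.3.2 Type 2: $\sqrt{x^2 + a^2}$

If an integral involves an odd power of $\sqrt{x^2 + a^2}$, the correct substitution is
$x = a\tan(\theta)$. This works because

$$x^2 + a^2 = a^2\tan^2(\theta) + a^2 = a^2\left(\tan^2(\theta) + 1\right) = a^2\sec^2(\theta).$$

Also, we’ll need to know that $dx = a\sec^2(\theta)\,d\theta$. Since $\tan(\theta) = x/a$, the
triangle now looks like this:

[Figure: right-angled triangle with angle $\theta$, opposite side $x$, adjacent side $a$, and hypotenuse $\sqrt{x^2 + a^2}$.]


19.3.3 Type 3: $\sqrt{x^2 - a^2}$

If an integral involves an odd power of $\sqrt{x^2 - a^2}$, the correct substitution is
$x = a\sec(\theta)$. This works because

$$x^2 - a^2 = a^2\sec^2(\theta) - a^2 = a^2\left(\sec^2(\theta) - 1\right) = a^2\tan^2(\theta),$$

and you can easily take square roots. To make the substitution, we’ll also
need the fact that $dx = a\sec(\theta)\tan(\theta)\,d\theta$. Since $\sec(\theta) = x/a$, the triangle
looks like this: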


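[Figure: right-angled triangle with angle $\theta$, hypotenuse $x$, adjacent side $a$, and opposite side $\sqrt{x^2 - a^2}$.]

Let’s see an example. To find $\int \frac{dx}{x^3\sqrt{x^2 - 4}}$ from above, we
set $x = 2\sec(\theta)$, so $dx = 2\sec(\theta)\tan(\theta)\,d\theta$ and $x^2 - 4 = 4\tan^2(\theta)$. The integral
becomes

$$\int \frac{2\sec(\theta)\tan(\theta)}{(2\sec(\theta))^3\sqrt{4\tan^2(\theta)}}\,d\theta = \int \frac{2\sec(\theta)\tan(\theta)}{8\sec^3(\theta) \times 2\tan(\theta)}\,d\theta = \frac{1}{8}\int \frac{1}{\sec^2(\theta)}\,d\theta = \frac{1}{8}\int \cos^2(\theta)\,d\theta.$$

Actually, this time it’s wrong to replace $\sqrt{4\tan^2(\theta)}$ by $2\tan(\theta)$; this is only
correct if $x > 0$ in the original integral, as we’ll see in Section 19.3.6 below. So
let’s make that assumption. Now we need to find $\frac{1}{8}\int \cos^2(\theta)\,d\theta$. The power of
cosine is even, so we have to use the double-angle formula from Section 19.2.1
above:

$$\frac{1}{8}\int \cos^2(\theta)\,d\theta = \frac{1}{16}\int \left(1 + \cos(2\theta)\right)d\theta = \frac{1}{16}\left(\theta + \frac{\sin(2\theta)}{2}\right) + C.$$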

OK, we just have to get back to $x$-land. This is a little tricky, even using the
appropriate triangle:

[Figure: right-angled triangle with angle $\theta$, hypotenuse $x$, adjacent side 2, and opposite side $\sqrt{x^2 - 4}$.]

The problem is that we need to know what $\sin(2\theta)$ is. To do this, we use the
identity

$$\sin(2\theta) = 2\sin(\theta)\cos(\theta).$$

Then we can use the above triangle to see that $\sin(\theta) = \sqrt{x^2 - 4}/x$ and
$\cos(\theta) = 2/x$, substitute everything in, and get

$$\int \frac{dx}{x^3\sqrt{x^2 - 4}} = \frac{1}{16}\sec^{-1}\left(\frac{x}{2}\right) + \frac{1}{32} \times 2 \times \frac{\sqrt{x^2 - 4}}{x} \times \frac{2}{x} + C$$
$$= \frac{1}{16}\sec^{-1}\left(\frac{x}{2}\right) + \frac{\sqrt{x^2 - 4}}{8x^2} + C.$$

Remember, this only applies when $x > 0$. We’ll revisit this example in
Section 19.3.6 to see how to take care of the case when $x < 0$.


19.3.4 Completing the square and trig substitutions 



Now, one other important point before we summarize the situation. From
time to time, you might want to solve an integral involving an odd power
of $\sqrt{\pm x^2 + ax + b}$. That is, you now have a linear term $ax$ to complicate
matters. The technique is simple: complete the square first and substitute
to get it into one of the three types that we’ve investigated. For example, to
evaluate

$$\int (x^2 - 4x + 19)^{-5/2}\,dx,$$

first complete the square (see Section 1.6 of Chapter 1 for a reminder of how
to do this):

$$x^2 - 4x + 19 = (x^2 - 4x + 4) - 4 + 19 = (x - 2)^2 + 15.$$

So the integral we want is actually

$$\int \left((x - 2)^2 + 15\right)^{-5/2}dx.$$

Now let $t = x - 2$, so $dt = dx$, and in $t$-land the integral becomes

$$\int (t^2 + 15)^{-5/2}\,dt,$$

which we have already done earlier in Section 19.3.2! The answer was
(replacing the old $x$ by $t$)

$$\frac{1}{225}\left(\frac{t}{\sqrt{t^2 + 15}} - \frac{t^3}{3(t^2 + 15)^{3/2}}\right) + C,$$

so replacing $t$ now by $x - 2$, we see that

$$\int (x^2 - 4x + 19)^{-5/2}\,dx = \frac{1}{225}\left(\frac{x - 2}{\sqrt{x^2 - 4x + 19}} - \frac{(x - 2)^3}{3(x^2 - 4x + 19)^{3/2}}\right) + C.$$

The moral of the story, both here and when using partial fractions, is that
a quadratic with a linear term can be made into a quadratic without one by
completing the square and substituting.
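
Here is a short Python sketch of the two ingredients: completing the square for
a quadratic $x^2 + bx + c$, and a numerical check that the final answer
differentiates back to the integrand. The helper `complete_square` and its
return convention $(h, k)$, meaning $x^2 + bx + c = (x - h)^2 + k$, are
hypothetical names introduced just for this illustration.

```python
import math


def complete_square(b, c):
    """Return (h, k) with x**2 + b*x + c == (x - h)**2 + k."""
    h = -b / 2
    k = c - b * b / 4
    return h, k


def integrand(x):
    """(x^2 - 4x + 19)^(-5/2)."""
    return (x * x - 4 * x + 19) ** -2.5


def antiderivative(x):
    """The answer above with C = 0, written in terms of t = x - h."""
    h, k = complete_square(-4, 19)      # h = 2, k = 15
    t = x - h
    q = t * t + k
    return (t / math.sqrt(q) - t ** 3 / (3 * q ** 1.5)) / k ** 2


def derivative(f, x, step=1e-6):
    """Central-difference estimate of f'(x)."""
    return (f(x + step) - f(x - step)) / (2.0 * step)


if __name__ == "__main__":
    print(complete_square(-4, 19))
    for x in (-3.0, 2.0, 6.5):
        print(x, derivative(antiderivative, x), integrand(x))
```

Note that the factor out front is $1/k^2 = 1/15^2 = 1/225$, which is where the
225 in the answer comes from.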

19.3.5 Summary of trig substitutions




To summarize the three main types we’ve looked at, here’s a table that shows
the appropriate substitutions and triangles for each type:

Type 1: $\sqrt{a^2 - x^2}$
    Set $x = a\sin(\theta)$
    $dx = a\cos(\theta)\,d\theta$
    $a^2 - x^2 = a^2\cos^2(\theta)$
    Triangle: hypotenuse $a$, side opposite $\theta$ is $x$, adjacent side $\sqrt{a^2 - x^2}$

Type 2: $\sqrt{x^2 + a^2}$
    Set $x = a\tan(\theta)$
    $dx = a\sec^2(\theta)\,d\theta$
    $x^2 + a^2 = a^2\sec^2(\theta)$
    Triangle: hypotenuse $\sqrt{x^2 + a^2}$, side opposite $\theta$ is $x$, adjacent side $a$

Type 3: $\sqrt{x^2 - a^2}$
    Set $x = a\sec(\theta)$
    $dx = a\sec(\theta)\tan(\theta)\,d\theta$
    $x^2 - a^2 = a^2\tan^2(\theta)$
    Triangle: hypotenuse $x$, side opposite $\theta$ is $\sqrt{x^2 - a^2}$, adjacent side $a$

The next section discusses the technical point about when (and why) you can
drop the absolute value signs when you take square roots of quantities like
$a^2\cos^2(\theta)$ or $a^2\tan^2(\theta)$. It’s the sort of thing that you may want to skim over
first, then come back to later if you have time.

19.3.6 Technicalities of square roots and trig substitutions 

You have been warned: this section gets a little messy. Still with me? Good.
Now, think back to Type 1 above. We simplified $\sqrt{a^2\cos^2(\theta)}$ down to $a\cos(\theta)$,
completely ignoring the need to use absolute values around the $\cos(\theta)$.
Actually, when we write $x = a\sin(\theta)$, we really mean that $\theta = \sin^{-1}(x/a)$.

So where is $\theta$? Well, from Section 10.2.1 in Chapter 10, we know that
the range of $\sin^{-1}$ is $[-\pi/2, \pi/2]$; this means that $\theta$ is in the first or fourth
quadrant, so $\cos(\theta)$ is always nonnegative. We don’t need any absolute values!

The same goes for Type 2. In that case, we’d really like to simplify
$\sqrt{a^2\sec^2(\theta)}$ as $a\sec(\theta)$. Can we do this without using absolute value signs?
We have $x = a\tan(\theta)$, so $\theta = \tan^{-1}(x/a)$. The range of $\tan^{-1}$ is $(-\pi/2, \pi/2)$,
so $\theta$ is once again in the first or fourth quadrant. This means that $\sec(\theta)$ is
always positive, so again, we don’t need absolute values.

Everything goes wrong in Type 3, unfortunately. Here we need to deal
with $\sqrt{a^2\tan^2(\theta)}$, but this isn’t always equal to $a\tan(\theta)$. You see, since
$x = a\sec(\theta)$, we have $\theta = \sec^{-1}(x/a)$. If you look back at Section 10.2.4 in
Chapter 10, you’ll see that the range of $\sec^{-1}$ is the interval $[0, \pi]$, except for
the point $\pi/2$. So $\theta$ is in the first or second quadrant, and $\tan(\theta)$ could be
positive or negative. At least it has the same sign as $x$ does, as you can see
by looking at the graph of $y = \sec^{-1}(x)$.

So, it’s correct to write $\sqrt{a^2\tan^2(\theta)} = a\tan(\theta)$ when $x > 0$. On the other
hand, if $x < 0$ then you have to write $-a\tan(\theta)$ instead. In that case, the
triangle actually looks like this:


[Figure: triangle with angle $\theta$, hypotenuse $x$, adjacent side $a$, and opposite side $-\sqrt{x^2 - a^2}$.]

I agree that it’s freaky that this triangle has two negative sides ($x$ and
$-\sqrt{x^2 - a^2}$), but it works as a neat memory device, since all the signs of
the trig functions are correct. In our example

$$\int \frac{dx}{x^3\sqrt{x^2 - 4}}$$

from Section 19.3.3 above, we saw that the integral works out to be

$$\frac{1}{16}\sec^{-1}\left(\frac{x}{2}\right) + \frac{\sqrt{x^2 - 4}}{8x^2} + C$$

when $x > 0$. (Actually, if $x > 0$, then $x$ has to be greater than 2, or else the
$\sqrt{x^2 - 4}$ factor in the denominator really screws up the situation.) Now let’s
redo the problem for the case when $x < 0$. We still substitute $x = 2\sec(\theta)$,
but now we must replace $\sqrt{4\tan^2(\theta)}$ by $-2\tan(\theta)$. The only difference from
before is the minus sign:

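$$\int \frac{dx}{x^3\sqrt{x^2 - 4}} = \int \frac{2\sec(\theta)\tan(\theta)}{(2\sec(\theta))^3\sqrt{4\tan^2(\theta)}}\,d\theta = \int \frac{2\sec(\theta)\tan(\theta)}{8\sec^3(\theta) \times (-2\tan(\theta))}\,d\theta$$
$$= -\frac{1}{8}\int \cos^2(\theta)\,d\theta = -\frac{\theta}{16} - \frac{2\sin(\theta)\cos(\theta)}{32} + C.$$

Migrating back to $x$-land, we have to use a modified triangle:

[Figure: triangle with angle $\theta$, hypotenuse $x$, adjacent side 2, and opposite side $-\sqrt{x^2 - 4}$.]

So in fact $\sin(\theta) = -\sqrt{x^2 - 4}/x$ and $\cos(\theta) = 2/x$. Notice that $\sin(\theta)$ is
actually greater than 0, since $x < 0$. Anyway, substituting back into the
above integral, we see that

$$\int \frac{dx}{x^3\sqrt{x^2 - 4}} = -\frac{1}{16}\sec^{-1}\left(\frac{x}{2}\right) - \frac{1}{32} \times 2 \times \left(\frac{-\sqrt{x^2 - 4}}{x}\right) \times \frac{2}{x} + C$$
$$= -\frac{1}{16}\sec^{-1}\left(\frac{x}{2}\right) + \frac{\sqrt{x^2 - 4}}{8x^2} + C.$$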


So, that’s the answer when $x < 0$. It’s almost the same as the previous
answer, but the inverse secant term needs a minus sign out front. Also, the
constant $C$ is potentially different from the other $C$ which arises when $x > 0$.
Why? Because we are looking for a function whose derivative is $1/(x^3\sqrt{x^2 - 4})$,
which itself has domain $(-\infty, -2) \cup (2, \infty)$. So the antiderivative is also in
two pieces, either of which can be shifted up or down independently of the
other. All in all, the complete answer is

$$\int \frac{dx}{x^3\sqrt{x^2 - 4}} = \begin{cases} \dfrac{1}{16}\sec^{-1}\left(\dfrac{x}{2}\right) + \dfrac{\sqrt{x^2 - 4}}{8x^2} + C_1 & \text{when } x > 2, \\[2ex] -\dfrac{1}{16}\sec^{-1}\left(\dfrac{x}{2}\right) + \dfrac{\sqrt{x^2 - 4}}{8x^2} + C_2 & \text{when } x < -2. \end{cases}$$

Here $C_1$ and $C_2$ are potentially different constants. Actually, we’ve already
encountered an integral where two constants should be involved: $\int 1/x\,dx$. See
Section 17.7 in Chapter 17 for more details. In practice, problems involving
Type 3 are often phrased (or intended to be phrased) with the condition that
$x > 0$. This allows one to avoid all the above mess and take square roots
without a care in the world. Just beware: if $x < 0$, then you need to be a lot
more careful....
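
The sign flip on the inverse secant term is easy to get wrong, so here is a
Python sketch that checks both branches numerically. Python's `math` module has
no inverse secant, so the code uses the identity $\sec^{-1}(y) = \cos^{-1}(1/y)$,
which gives values in $[0, \pi]$ as described above. The function names and
sample points are illustrative assumptions.

```python
import math


def integrand(x):
    """1 / (x^3 sqrt(x^2 - 4)), defined only for |x| > 2."""
    return 1.0 / (x ** 3 * math.sqrt(x * x - 4))


def antiderivative(x):
    """Piecewise answer with C_1 = C_2 = 0."""
    if abs(x) <= 2:
        raise ValueError("the integrand is only defined for |x| > 2")
    theta = math.acos(2.0 / x)            # sec^{-1}(x / 2), written via arccos
    sign = 1.0 if x > 0 else -1.0         # the arcsec term flips sign for x < -2
    return sign * theta / 16 + math.sqrt(x * x - 4) / (8 * x * x)


def derivative(f, x, h=1e-6):
    """Central-difference estimate of f'(x)."""
    return (f(x + h) - f(x - h)) / (2.0 * h)


if __name__ == "__main__":
    for x in (3.0, 5.0, -2.5, -3.0):
        print(x, derivative(antiderivative, x), integrand(x))
```

If you drop the `sign` factor and rerun, the rows with negative $x$ should stop
matching, which is exactly the trap this section is warning about.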

19.4 Overview of Techniques of Integration 

We’ve now built up quite a toolkit of techniques of integration. Now the
question is, given an integral, which technique do you use? Sometimes it’s
not easy, and you may have to try several different methods until you hit upon
the right one. Sometimes you even need to combine the methods. Here are
some general guidelines to help you out:

• If an “obvious” substitution comes to mind, try it. For example, if one 
factor of the integrand is the derivative of another piece of the integrand, 
try substituting t for that other piece. 

• If something like $\sqrt{ax + b}$ appears in the integrand, try substituting
$t = \sqrt{ax + b}$, as described in Section 18.1.2 of the previous chapter.

• To integrate a rational function (that is, a quotient of polynomials), 
see if the top is a multiple of the derivative of the bottom. If so, you 
can just substitute t = denominator. Otherwise, use partial fractions 
(Section 18.3 of the previous chapter). 

• After checking that no obvious substitution looks as if it will work, 
use the techniques from the beginning of this chapter to find integrals 
involving: 

– functions containing $\sqrt{1 + \cos(x)}$ or $\sqrt{1 - \cos(x)}$: in this case, use
the double-angle formula;

– functions involving one of $1 - \sin^2(x)$, $1 - \cos^2(x)$, $1 + \tan^2(x)$,
$\sec^2(x) - 1$, $\csc^2(x) - 1$, or $1 + \cot^2(x)$: in this case, use one of the
Pythagorean identities $\sin^2(x) + \cos^2(x) = 1$, $\tan^2(x) + 1 = \sec^2(x)$,
or $1 + \cot^2(x) = \csc^2(x)$;

– functions with $1 \pm \sin(x)$ (or similar) in the denominator: in this
case, multiply and divide by the conjugate expression and try to use
the Pythagorean identities;

– functions containing products like $\cos(mx)\cos(nx)$, $\sin(mx)\sin(nx)$,
or $\sin(mx)\cos(nx)$: in this case, use the products-to-sums identities;
or

– powers of trig functions: you’ll just have to learn the individual
techniques in Sections 19.2.1 through 19.2.5 above.

• If the integrand involves Vx 2 — a 2 or any odd power of this (for example 
(x 2 — a 2 ) 3 / 2 , (pc 2 — a 2 ) 5 / 2 , and so on), or y/x 2 -f a 2 or \/o? — x 2 or an 
odd power of any of these last two, then use a trig substitution (after 
checking that there’s no obvious substitution). If the quadratic includes 
a linear term, complete the square first. See Section 19.3 above for more 
details. 
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[f the integrand is a product and no obvious substitution comes to mind, 
try integration by parts. (See Section 18.2 of the previous chapter for 



• If no substitution appeals, then a good rule of thumb is that functions
involving a power of $\ln(x)$ or an inverse trig function should be integrated
by parts. In that case, let $u$ be the power of $\ln(x)$ or the inverse trig
function as appropriate. For example, how would you find

$$\int \frac{\ln(1 + x^2)}{x^2}\,dx?$$

First check that no substitution appeals; since nothing springs to mind,
think of integration by parts. Wait a second, it’s not a product! Wait
another second, quotients are products too! Just rewrite the integral as

$$\int \ln(1 + x^2) \cdot \frac{1}{x^2}\,dx,$$

then integrate by parts with $u = \ln(1 + x^2)$ and $dv = (1/x^2)\,dx$. Try it
now; you should get the answer

$$-\frac{\ln(1 + x^2)}{x} + 2\tan^{-1}(x) + C.$$
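If you want to check an answer to the integration by parts example, you can differentiate it numerically and compare with the integrand. The following is only a sketch: it assumes the integral in question is $\int \ln(1 + x^2)/x^2\,dx$ and that the candidate answer is $-\ln(1 + x^2)/x + 2\tan^{-1}(x) + C$; the function names and the step size `h` are illustrative choices, not anything from the text.

```python
import math

def integrand(x):
    # The integrand ln(1 + x^2) / x^2 from the example.
    return math.log(1 + x * x) / (x * x)

def candidate_antiderivative(x):
    # What integration by parts with u = ln(1 + x^2), dv = dx / x^2 gives
    # (the constant C is dropped, since it does not affect the derivative).
    return -math.log(1 + x * x) / x + 2 * math.atan(x)

def numerical_derivative(func, x, h=1e-5):
    # Central-difference approximation to func'(x).
    return (func(x + h) - func(x - h)) / (2 * h)

for x in (0.5, 1.0, 3.0):
    # The last two columns should agree to many decimal places.
    print(x, integrand(x), numerical_derivative(candidate_antiderivative, x))
```

A check like this doesn’t prove anything, but it catches sign errors and dropped factors very quickly.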



Even if you memorize all the above techniques, you will be lost in a sea of
confusion unless you practice a whole load of problems. Make sure that you
get a good feel for which method to use on which integral. Then you will
truly be a bad-ass integrator.









CHAPTER 20 


Improper Integrals: Basic Concepts 


This is a difficult topic, so I’m devoting two chapters to it. This chapter 
serves as an introduction to improper integrals. The next chapter gets into 
the details of how to solve problems involving improper integrals. If you are 
reading this chapter for the first time, you should probably take care to try 
to understand all the points in it. On the other hand, if you are reviewing 
for a test, most likely you’ll want to skim over the chapter, noting the boxed 
formulas and the sections marked as important, and concentrate on the next 
chapter. Here’s what we’ll actually look at in this chapter: 

• the definition of improper integrals, convergence, and divergence; 

• improper integrals over unbounded regions; and 

• the theoretical basis for the comparison test, the limit comparison test, 
the p-test, and the absolute convergence test. 

We’ll revisit all four of these tests in the next chapter and see many examples 
of how to apply them. 

20.1 Convergence and Divergence

What is an improper integral, anyway? In Chapter 16, we saw that the
integral

$$\int_a^b f(x)\,dx$$

certainly makes sense if $f$ is a bounded function on $[a, b]$ which is continuous
except at a finite number of places. If $f$ has infinitely many discontinuities,
the integral might still make sense, or it might be totally screwed up (see
Section 16.7 of Chapter 16 for an example). What if $f$ isn’t bounded? This
means that the values of $f(x)$ manage to get really large (positively or
negatively or both) while $x$ is in the interval $[a, b]$. This sort of thing typically
happens when $f$ has a vertical asymptote somewhere in this interval: the
function blows up there and can’t be bounded. This causes the above integral
to be improper.

There’s a different type of unboundedness that can occur even if $f$ is
bounded. The interval $[a, b]$ can actually be infinite: something like $[0, \infty)$,
$[-7, \infty)$, $(-\infty, 3]$ or even $(-\infty, \infty)$. This also makes the above integral
improper.

So, the integral $\int_a^b f(x)\,dx$ is improper if any of the following conditions
apply:

1. $f$ isn’t bounded in the closed interval $[a, b]$;

2. $b = \infty$; or

3. $a = -\infty$.

For now, let’s concentrate on what happens if the first of these conditions
applies; we’ll return to the other two conditions in Section 20.2 below. As I said,
the typical way that a function fails to be bounded is if it has a vertical
asymptote somewhere, although there can be more exotic types of behavior.
(An example is $f(x) = \frac{1}{x}\sin\left(\frac{1}{x}\right)$, which oscillates really wildly as $x$ approaches
0.) If $f(x)$ is unbounded for $x$ near some number $c$, we’ll say that $f$ has a
blow-up point at $x = c$. Again, in most situations, this is the same thing as a
vertical asymptote.

So let’s look at the simple case of when our function $f$ has a vertical
asymptote at $x = a$. The situation looks something like this:



I’d be lying through my teeth if I claimed that $\int_a^b f(x)\,dx$ is the area (in square
units) of the shaded region. The problem is that the region actually extends 
up the page, then past the top of the page, going on and on forever, as the 
arrow is trying to indicate. The region does get skinnier as it goes up, though, 
because of the vertical asymptote. 

Since the region never stops going up, surely its area should be infinite, 
right? Not necessarily. A mathematical miracle can occur if the region is 
skinny enough, and the area can actually be finite. To see how a region can 
be unbounded yet have a finite area, we’ll use limits once again. Here’s the 
idea: let $\varepsilon$ be a small positive number; then you can integrate $f$ over the
region $[a + \varepsilon, b]$, since $f$ is bounded there. You’ll get some nice finite number.
Now, replay the situation but with an even smaller $\varepsilon$. You get a new finite
number. The situation now looks something like this:









means that if your function is bounded and the region of integration $[a, b]$
is bounded, then there’s no issue: the integral converges since it’s not even 
improper. It’s just some nice finite number, no sweat. 

Now, here’s a summary of the situation when you have a blow-up point at
$x = a$:

If $f(x)$ is unbounded for $x$ near $a$ only, then set

$$\int_a^b f(x)\,dx = \lim_{\varepsilon \to 0^+} \int_{a+\varepsilon}^b f(x)\,dx,$$

provided that the limit exists. If it does, then the integral converges; if not,
the integral diverges. Just like any limit, the above one may fail to exist
because it might be $\infty$ or $-\infty$, or things might oscillate around too much as
$\varepsilon$ tends to $0^+$.

This brings us to an important point. When we look at an improper 
integral, the most important thing we need to find out is whether it converges 
or diverges. It’s much less important to know what the integral converges to 
(assuming it converges). In practice, you can use computational techniques 
to estimate the value, but only if you know that the integral converges. If the 
integral diverges, you can get some whacked-out results if you try to use a 
computer to approximate your integral. Computers don’t really understand 
infinities or crazy oscillations (yet!). 


20.1.1 Some examples of improper integrals






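Consider the two integrals $\int_0^1 1/x\,dx$ and $\int_0^1 1/\sqrt{x}\,dx$. Both are
improper because their integrands have vertical asymptotes at $x = 0$, so let’s
use the formula in the box above. In the first case, we have

$$\int_0^1 \frac{1}{x}\,dx = \lim_{\varepsilon \to 0^+} \int_\varepsilon^1 \frac{1}{x}\,dx = \lim_{\varepsilon \to 0^+} \ln|x|\,\Big|_\varepsilon^1 = \lim_{\varepsilon \to 0^+} (\ln(1) - \ln(\varepsilon)) = \infty.$$

(Here we have used the facts that $\ln(1) = 0$ and that $\ln(\varepsilon) \to -\infty$ as $\varepsilon \to 0^+$.)
So the improper integral $\int_0^1 1/x\,dx$ must diverge. How about
the other integral? Using the formula again, we have

$$\int_0^1 \frac{1}{\sqrt{x}}\,dx = \lim_{\varepsilon \to 0^+} \int_\varepsilon^1 x^{-1/2}\,dx = \lim_{\varepsilon \to 0^+} 2x^{1/2}\,\Big|_\varepsilon^1 = \lim_{\varepsilon \to 0^+} (2\sqrt{1} - 2\sqrt{\varepsilon}) = 2.$$

This is a finite number, so the integral $\int_0^1 1/\sqrt{x}\,dx$ converges. As it
turns out, we have shown that the integral converges to 2, but as I said at the
end of the previous section, we don’t care that much. Our main focus is to decide
whether an improper integral converges, without worrying what it actually
converges to.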
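To watch the two limits above in action, you can plug smaller and smaller values of $\varepsilon$ into the antiderivatives. This is just a sketch; the function names are made up for illustration, and the formulas inside them are the antiderivatives $\ln(x)$ and $2\sqrt{x}$ evaluated between $\varepsilon$ and 1.

```python
import math

def integral_one_over_x(eps):
    # Integral of 1/x from eps to 1, which is ln(1) - ln(eps).
    return math.log(1.0) - math.log(eps)

def integral_one_over_sqrt_x(eps):
    # Integral of 1/sqrt(x) from eps to 1, which is 2*sqrt(1) - 2*sqrt(eps).
    return 2 * math.sqrt(1.0) - 2 * math.sqrt(eps)

for k in range(1, 9):
    eps = 10.0 ** (-k)
    print(f"eps = {eps:.0e}   1/x: {integral_one_over_x(eps):8.3f}   "
          f"1/sqrt(x): {integral_one_over_sqrt_x(eps):.6f}")
```

The first column keeps growing as $\varepsilon$ shrinks, while the second settles down near 2, which matches the divergence and convergence found above.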

So what’s going on here? Why should the improper integral $\int_0^1 1/x\,dx$
diverge but $\int_0^1 1/\sqrt{x}\,dx$ converge? After all, when you think about it, the
graphs of $y = 1/x$ and $y = 1/\sqrt{x}$ look roughly the same, something like this:

[Figure: the graphs of $y = 1/x$ and $y = 1/\sqrt{x}$ near the $y$-axis.]

Of course, the integrands are not the same. Indeed, $1/x$ is greater than $1/\sqrt{x}$
when $0 < x < 1$. Geometrically, the graph of $y = 1/\sqrt{x}$ is actually a little
closer to the $y$-axis than $y = 1/x$ is. It turns out that $y = 1/\sqrt{x}$ is close enough
to the $y$-axis to make the corresponding integral converge, while $y = 1/x$ isn’t
close enough to the $y$-axis and its integral diverges. Unfortunately, there’s no
easy way to classify all the functions with vertical asymptotes at $x = 0$ to
see which ones are close enough to the asymptote and which ones aren’t,
so you just have to look at each improper integral on its own
merits.

Here’s a really important point. Suppose you have an improper
integral $\int_a^b f(x)\,dx$, where $f$ has a vertical asymptote at $x = a$ only, and you
want to know whether it converges or diverges. The value of $b$ doesn’t
matter! You can change it to any finite number bigger than $a$, so long as you
don’t pick up any new vertical asymptotes or blow-up points. To see why,
first note that, by definition,

$$\int_a^b f(x)\,dx = \lim_{\varepsilon \to 0^+} \int_{a+\varepsilon}^b f(x)\,dx,$$

provided that the limit exists. Now let’s change $b$ to some other number $c$
which is bigger than $a$. If $f$ still only blows up at $x = a$, then we have

$$\int_a^c f(x)\,dx = \lim_{\varepsilon \to 0^+} \int_{a+\varepsilon}^c f(x)\,dx,$$

again provided that the limit exists. We can split this last integral at $x = b$
(the technique is described in Section 16.3 of Chapter 16) to get

$$\int_a^c f(x)\,dx = \lim_{\varepsilon \to 0^+} \left( \int_{a+\varepsilon}^b f(x)\,dx + \int_b^c f(x)\,dx \right).$$

The second integral doesn’t depend on $\varepsilon$ at all; in fact, since $f$ is bounded
between $b$ and $c$ inclusive, that integral is just some nice finite number $M$.
So we have shown that

$$\int_a^c f(x)\,dx = \lim_{\varepsilon \to 0^+} \int_{a+\varepsilon}^b f(x)\,dx + M.$$

If the limit on the right-hand side exists, then $\int_a^b f(x)\,dx$ converges. Adding
$M$ still keeps everything finite, so $\int_a^c f(x)\,dx$ also converges. If instead the
limit doesn’t exist, then adding $M$ doesn’t change that, so both $\int_a^b f(x)\,dx$
and $\int_a^c f(x)\,dx$ must diverge.

We have shown that the convergence or divergence of an improper integral
over a bounded region depends only on what the integrand does very close to
its blow-up points. In particular, since we know that $\int_0^1 1/x\,dx$ diverges, we
can also conclude that

$$\int_0^2 \frac{1}{x}\,dx, \quad \int_0^{100} \frac{1}{x}\,dx, \quad \text{and} \quad \int_0^{0.0000001} \frac{1}{x}\,dx$$

all diverge. On the other hand, since $\int_0^1 1/\sqrt{x}\,dx$ converges, we get for free
that

$$\int_0^2 \frac{1}{\sqrt{x}}\,dx, \quad \int_0^{100} \frac{1}{\sqrt{x}}\,dx, \quad \text{and} \quad \int_0^{0.0000001} \frac{1}{\sqrt{x}}\,dx$$

all converge. All the action goes on really near the asymptote $x = 0$.


20.1.2 Other blow-up points 

In the integral $\int_a^b f(x)\,dx$, if $f$ has only one blow-up point at the right-hand
limit of integration $b$ (instead of $a$), then we can play the same game as we

Notice that none of these integrals has more than one problem spot, and that
all the problem spots are at one of the limits of integration. The integrals $I_1$,
$I_3$, and $I_5$ have their only problem spot at their left-hand limits of integration,
while $I_2$ and $I_4$ have their problem spot on the right. The only way that the
original integral $I$ can converge is if all five pieces $I_1$ through $I_5$ converge. If
they do all converge, then the value of $I$ is the sum of the values of $I_1$ through
$I_5$. (In fact, none of the five pieces converge! We’ll see why in Section 21.5 of
the next chapter.)

20.2 Integrals over Unbounded Regions

Now, we still have to look at what happens when one or both of the limits of
integration are infinite; this means that the region of integration is unbounded.
To handle

$$\int_a^\infty f(x)\,dx,$$

where $a$ is any finite number and $f$ has no blow-up points in $[a, \infty)$, let’s use
another limiting technique. This time, we integrate over the region $[a, N]$,
where $N$ is a massively large number. This will give us a nice finite value.
Then repeat but with an even larger $N$ to get a new value. Continue onward
and see what happens to the values of the integrals. If they have a limit, then
the integral converges. Otherwise, it diverges. In symbols, we are defining

$$\int_a^\infty f(x)\,dx = \lim_{N \to \infty} \int_a^N f(x)\,dx,$$

provided that the limit exists; in this case, the integral converges. Otherwise,
it diverges. For reasons similar to those described at the end of Section 20.1.1
above, the value of $a$ is irrelevant. So long as you don’t pick up any new
blow-up points of $f$, the value of $a$ doesn’t affect whether the improper integral
converges or diverges. The only thing that really matters is how $f(x)$ behaves
when $x$ is very large indeed.
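Here’s a rough numerical sketch of the "integrate up to $N$, then let $N$ grow" idea. It assumes a homemade composite Simpson’s rule (the helper `simpson` and the sample integrands $1/x^2$ and $1/x$ are illustrative choices, not part of the definition), and it only suggests what the limit is doing; it doesn’t prove convergence or divergence.

```python
import math

def simpson(func, a, b, n=100_000):
    # Composite Simpson's rule on [a, b] with n subintervals (n made even).
    if n % 2:
        n += 1
    h = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3

def truncated_integral(func, a, big_n):
    # Approximates the integral of func over [a, N] for one fixed, finite N.
    return simpson(func, a, big_n)

for big_n in (10, 100, 1000):
    settles = truncated_integral(lambda x: 1 / x**2, 1.0, big_n)
    grows = truncated_integral(lambda x: 1 / x, 1.0, big_n)
    print(f"N = {big_n:5d}   1/x^2: {settles:.5f}   1/x: {grows:.5f}")
```

For $1/x^2$ the values creep up toward 1, while for $1/x$ they keep growing like $\ln(N)$. Changing the lower limit from 1 to some other positive number shifts the values but not the verdict.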

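In a similar manner to the above definition, if $f$ has no blow-up points in
$(-\infty, b]$, then

$$\int_{-\infty}^b f(x)\,dx = \lim_{N \to \infty} \int_{-N}^b f(x)\,dx,$$

provided that the limit exists.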



What if $f$ has no blow-up points anywhere and we want to find

$$\int_{-\infty}^\infty f(x)\,dx?$$

Although there are no blow-up points, there are still two problem spots: $\infty$
and $-\infty$. That’s right: we are regarding $\infty$ and $-\infty$ as problem spots
whenever they show up, since we have to treat them separately. So we have to
split the above integral into two pieces so that each one has only one problem
spot. Pick your favorite number (mine is 0 for the moment), and consider the
integrals

20.3 The Comparison Test (Theory)

Suppose we have two functions which are never negative, at least in some
region of interest. If the first function is bigger than the second function, and
the integral of the second function (over our region) diverges, then the integral
of the first function (over the same region) also diverges. Mathematically, it
looks like this. Let’s say we want to know something about $\int_a^b f(x)\,dx$, but
we only know something about $\int_a^b g(x)\,dx$. If $f(x) \ge g(x) \ge 0$ for $x$ in the
interval $(a, b)$, and we know that $\int_a^b g(x)\,dx$ diverges, then so does $\int_a^b f(x)\,dx$.
In fact, since $f(x) \ge g(x)$, we can write

$$\int_a^b f(x)\,dx \ge \int_a^b g(x)\,dx = \infty.$$

So the first integral also diverges. In our example above, we’d just write 

f 1 1 7 f 1 1 7 

/ — dx> / — ax = oo, 

JO x Jo x 

and conclude that the left-hand integral diverges. Of course, we had to know 
that the right-hand integral diverges, but we already saw that earlier. 

The situation is even clearer when one looks at a picture:

[Figure: the curve $y = f(x)$ sitting above the curve $y = g(x)$ between $x = a$ and $x = b$.]

In this picture, the area under $y = g(x)$ between $x = a$ and $x = b$ is supposed
to be infinite. The curve $y = f(x)$ sits above $y = g(x)$, so the area under it
(between $x = a$ and $x = b$) should be even greater. More than infinite is still
infinite, so $\int_a^b f(x)\,dx$ also diverges.

What if $\int_a^b g(x)\,dx$ diverges but $f(x) \le g(x)$ instead? What can you say
about $\int_a^b f(x)\,dx$? The answer is: diddly-squat. Bubkes. Nothing at all. Let’s
see how the math would go:

$$\int_a^b f(x)\,dx \le \int_a^b g(x)\,dx = \infty.$$

So the integral we are interested in, $\int_a^b f(x)\,dx$, is less than or equal to infinity.
That is, either it is less than infinity, so it converges, or it is equal to infinity,
so it diverges. Great: we now know that it either converges or diverges.
Whoop-di-doo. Yup, we haven’t accomplished anything. So don’t do this.

On the other hand, for convergence, it is the other way around. Here, if
we want to know about $\int_a^b f(x)\,dx$ and we know that $\int_a^b g(x)\,dx$ converges,
we’d better hope that $f(x) \le g(x)$. You might say that we want $f$ to be
“controlled” by $g$. Well, then we’d get convergence (still assuming that both
functions are positive). So, if $0 \le f(x) \le g(x)$ on $(a, b)$ and $\int_a^b g(x)\,dx$
converges, then so does $\int_a^b f(x)\,dx$. Mathematically,

$$0 \le \int_a^b f(x)\,dx \le \int_a^b g(x)\,dx < \infty,$$

so both integrals converge (noting that the left-hand integral is positive, so it
can’t diverge down to $-\infty$). The picture looks like this:



The shaded area under y = g(x) between x = a and x = b is assumed to be 
finite. You can clearly see from the picture that the area we want, which is 
under y = f(x) between x = a and x = b, is less than the finite shaded area. 
Since the area we want is positive and less than a finite number, it must also 
be finite. 

Beware: suppose you know that $\int_a^b g(x)\,dx$ converges, but you have the
inequality $f(x) \ge g(x)$ instead. Now the curve you want ($y = f(x)$) sits above
the other curve ($y = g(x)$). This is no good: you’d only be able to say that

$$\int_a^b f(x)\,dx \ge \int_a^b g(x)\,dx.$$

So the integral we are interested in on the left-hand side is greater than
or equal to some finite number. Our integral is therefore finite or infinite.
Great: no info whatsoever. Hey, we’re in the whoop-di-doo case again! So
don’t go there.

It’s true that I haven’t really justified the comparison test, mathematically
speaking. Actually there’s not that much to it. Some splitting of integrals
(and hairs) is required, but we’ve already seen the basic idea. For example, if
$f$ and $g$ both have a vertical asymptote at $x = a$ and have no blow-up points
anywhere else, and $0 \le f(x) \le g(x)$ for all $x$ in the interval $(a, b]$, then we can
say that

$$0 \le \int_{a+\varepsilon}^b f(x)\,dx \le \int_{a+\varepsilon}^b g(x)\,dx$$

for any $\varepsilon > 0$. Now take limits. If the improper integral $\int_a^b g(x)\,dx$ converges,
then the right-hand side becomes finite. Everything now depends on the
middle integral. Since $f(x)$ is never negative, the middle integral can only get bigger
as $\varepsilon$ tends toward 0 from above. It’s getting bigger and bigger, but it can’t
get past the barrier at $\int_a^b g(x)\,dx$, which is a nice finite number. The only
possibility is that the middle integral converges to some finite number as
$\varepsilon \to 0^+$.* In other words, $\int_a^b f(x)\,dx$ converges. That proves the comparison
test in its convergence version (the second of the two versions we looked at
above), in the special case where $f$ and $g$ only have problems at $x = a$. It is
now up to you to prove the divergence version and also to work out how to
deal with problems at $x = b$. There’s really not much difference. Of course,
if the problems are in the middle somewhere, or there are multiple problems,
you have to split the integral into pieces before using the comparison test
anyway.

We’ll look at many examples involving the comparison test in the next 
chapter. Now it’s time to look at another test. 


20.4 The Limit Comparison Test (Theory)

The comparison test uses the improper integral of one function to get
information about an improper integral of another function. The limit comparison
test does the same thing, except that we don’t actually need one function to
be bigger than the other. Instead, we need the two functions to be just about
the same. Here’s the basic idea: suppose that two functions $f$ and $g$ are very
close to each other at the blow-up point $x = a$ (and have no other blow-up
points). Then $\int_a^b f(x)\,dx$ and $\int_a^b g(x)\,dx$ either both diverge or both converge.
Their behavior is identical. Intuitively, it makes sense; let’s get down to
details by specifying what we really mean when we say that two functions are
“very close” to each other.

20.4.1 Functions asymptotic to each other

Suppose we have two functions $f$ and $g$ such that

$$\lim_{x \to a} \frac{f(x)}{g(x)} = 1.$$

This means that when $x$ is near $a$, the ratio $f(x)/g(x)$ is close to 1. If the
ratio were equal to 1, then $f(x)$ would equal $g(x)$. Since the ratio is only close
to 1, then $f(x)$ is “very close” to $g(x)$. This doesn’t mean that the difference
between $f(x)$ and $g(x)$ is small! For example, $f(x)$ could be a trillion and $g(x)$
could be a trillion plus a million (for the same value of $x$); in that case, the
ratio $f(x)/g(x)$ would be a little under 1, while the difference between $f(x)$
and $g(x)$ is still a million! On the other hand, the two numbers are relatively
close to each other, since a million is a small difference relative to the size of
the numbers.


* Actually, this statement, which seems obvious, is quite profound. The statement is
pretty much what distinguishes R from any of its proper subsets which contain every rational

So, we’ll say that $f(x) \sim g(x)$ as $x \to a$ if the limit of the ratio is 1. That
is,

$$f(x) \sim g(x) \text{ as } x \to a \quad \text{means the same thing as} \quad \lim_{x \to a} \frac{f(x)}{g(x)} = 1.$$

This doesn’t mean that $f(x)$ is approximately equal to $g(x)$ when $x$ is near
$a$: it means that the ratio of $f(x)$ to $g(x)$ is near 1 when $x$ is near $a$. We say
that $f$ and $g$ are asymptotic to each other as $x \to a$. Of course, you could
replace $x \to a$ by $x \to \infty$, or even $x \to a^+$; all you have to do is make the
same replacement in the limit too.

All this is useless unless we have limits of the form

$$\lim_{x \to a} \frac{f(x)}{g(x)} = 1.$$

Actually, we’ve seen many of these types of limits! Here are some examples:*

$$\lim_{x \to \infty} \frac{3x^3 - 1000x^2 + 5x - 7}{3x^3} = 1, \qquad \lim_{x \to 0} \frac{\sin(x)}{x} = 1,$$

$$\lim_{x \to 0} \frac{e^x - 1}{x} = 1, \qquad \text{and} \qquad \lim_{x \to 0} \frac{\ln(1 + x)}{x} = 1.$$

The first limit above can be written as $3x^3 - 1000x^2 + 5x - 7 \sim 3x^3$ as $x \to \infty$.
That is, $3x^3 - 1000x^2 + 5x - 7$ and $3x^3$ are asymptotic to each other as $x \to \infty$.
Similarly, the second limit says that $\sin(x) \sim x$ as $x \to 0$. The third and fourth
limits show that $e^x - 1$ and $\ln(1 + x)$ are also both asymptotic to $x$ as $x \to 0$;
that is, $e^x - 1 \sim x$ and $\ln(1 + x) \sim x$ as $x \to 0$.

All we’ve done is to rewrite each limit in a different form, but it is a
very convenient form. Indeed, you can take powers of asymptotic relations
and get new ones. For example, knowing that $\sin(x) \sim x$ as $x \to 0$, we can
immediately write that $\sin^3(x) \sim x^3$ as $x \to 0$, or even that $1/\sin(x) \sim 1/x$ as
$x \to 0$. You can also replace $x$ by any other quantity that goes to 0 as $x$ does,
such as a power of $x$. For example, starting with $\sin(x) \sim x$ as $x \to 0$ once
again, we can replace $x$ by $4x^7$ to see that $\sin(4x^7) \sim 4x^7$ as $x \to 0$. You can
even multiply or divide two relations by each other, provided that the limit is
at the same value of $x$ for both asymptotic relations. For example, we know
that $\tan(x) \sim x$ as $x \to 0$ since

$$\lim_{x \to 0} \frac{\tan(x)}{x} = \lim_{x \to 0} \frac{\sin(x)}{x} \cdot \frac{1}{\cos(x)} = 1 \cdot 1 = 1.$$

So we can multiply $\tan(x) \sim x$ and $\sin(x) \sim x$ (both as $x \to 0$) together to
get the asymptotic relation $\tan(x)\sin(x) \sim x^2$ as $x \to 0$.


*The examples can be found in Section 4.3 of Chapter 4, Section 7.1.1 of Chapter 7,
and Sections 9.4.2 and 9.4.3 of Chapter 9, respectively.

What you cannot do is add or subtract these relations. For example, if
you start with $\tan(x) \sim x$ and $\sin(x) \sim x$ as $x \to 0$, you can’t just subtract
the second relation from the first to get $\tan(x) - \sin(x) \sim x - x$. Indeed, $x - x$
is just 0, and nothing can be asymptotic to 0. Why not? Well, if $f(x) \sim 0$ as
$x \to a$, then we’d need

$$\lim_{x \to a} \frac{f(x)}{0} = 1.$$

That’s clearly garbage, since the left-hand side doesn’t make any sense. So,
by all means, multiply, divide, and take powers of asymptotic relations, but
don’t add or subtract them.
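A quick numerical look at these rules can be reassuring. In the sketch below, the function name `ratios` is made up for illustration, and the third ratio uses a fact that is not from the text above: the standard Taylor-series result that $\tan(x) - \sin(x)$ behaves like $x^3/2$ near 0, which is certainly not what naive subtraction would predict.

```python
import math

def ratios(x):
    # Each ratio should approach 1 as x approaches 0.
    sine_ratio = math.sin(x) / x                               # sin(x) ~ x
    product_ratio = math.tan(x) * math.sin(x) / x**2           # tan(x) sin(x) ~ x^2
    difference_ratio = (math.tan(x) - math.sin(x)) / (x**3 / 2)  # tan(x) - sin(x) ~ x^3 / 2
    return sine_ratio, product_ratio, difference_ratio

for x in (0.1, 0.01, 0.001):
    print(x, ratios(x))
```

Multiplying the two relations gives the right answer, but the difference $\tan(x) - \sin(x)$ is governed by terms that the relations $\tan(x) \sim x$ and $\sin(x) \sim x$ simply don’t record.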


20.4.2 The statement of the test



OK, so we have this notion of two functions being asymptotic to each other,
and we have some examples too (like $\sin(x) \sim x$ as $x \to 0$). So what? Well,
suppose you have some function $f$ with a problem spot only at $a$, and you’re
trying to see if the improper integral $\int_a^b f(x)\,dx$ converges or diverges. If you
can find a function $g$ which behaves like $f$ when the argument $x$ is near $a$,
then you can just replace $f$ by $g$ and see if $\int_a^b g(x)\,dx$ converges or diverges.
Whatever you find for $g$ also holds for $f$.

More formally, if $f(x) \sim g(x)$ as $x \to a$, and neither function has any
problem spots anywhere else on the interval $[a, b]$, then the integrals $\int_a^b f(x)\,dx$
and $\int_a^b g(x)\,dx$ both diverge or both converge. (If they both converge, then
the values they converge to may be different.) This is one case of the limit
comparison test. Here’s a sneak preview of its power; we’ll see many more
examples in the next chapter. Suppose we want to know whether

$$\int_0^1 \frac{1}{\sin(\sqrt{x})}\,dx$$

converges or diverges. It seems difficult to find an antiderivative of $1/\sin(\sqrt{x})$.
Luckily, we don’t have to. Since $\sin(x) \sim x$ as $x \to 0$, we can replace the small
quantity $x$ by another small quantity $\sqrt{x}$ to see that $\sin(\sqrt{x}) \sim \sqrt{x}$ as $x \to 0^+$.
(We need to use $x \to 0^+$ because $\sqrt{x}$ only makes sense when $x \ge 0$.) Taking
reciprocals, we have

$$\frac{1}{\sin(\sqrt{x})} \sim \frac{1}{\sqrt{x}} \quad \text{as } x \to 0^+.$$


Also note that $1/\sin(\sqrt{x})$ and $1/\sqrt{x}$ have no blow-up points in $(0, 1]$. So, the
limit comparison test says that the two integrals

$$\int_0^1 \frac{1}{\sin(\sqrt{x})}\,dx \quad \text{and} \quad \int_0^1 \frac{1}{\sqrt{x}}\,dx$$

either both converge or both diverge. We have replaced a difficult integral
with a much easier one, $\int_0^1 1/\sqrt{x}\,dx$. We already know from Section 20.1.1
above that this easier integral converges, so we can immediately conclude that
the integral we want (on the left) also converges.

Of course, there are cases of the test which apply when the blow-up point
is at $b$, or when the region of integration is unbounded. We’ll list all the
versions in Section 21.2 of the next chapter. In the meantime, let’s see why
the test works in the above case. Since $f(x) \sim g(x)$ as $x \to a$, we know that

$$\lim_{x \to a} \frac{f(x)}{g(x)} = 1.$$

In particular, provided we get close enough to $a$, the ratio $f(x)/g(x)$ must be
at least $\frac{1}{2}$ and no more than 2. That is, we can pick some $c$ between $a$ and $b$
such that

$$\frac{1}{2} \le \frac{f(x)}{g(x)} \le 2 \quad \text{for all } x \text{ in } (a, c].$$

This inequality can be rewritten as

$$\frac{1}{2}\,g(x) \le f(x) \le 2g(x) \quad \text{for all } x \text{ in } (a, c].$$

Now we can use the comparison test. For example, if $\int_a^b g(x)\,dx$ diverges, then
so does $\int_a^c g(x)\,dx$ (as we’ve seen above). In fact, so does $\frac{1}{2}\int_a^c g(x)\,dx$,
informally since one-half of infinity is still infinity! So, the fact that $f(x)$ is greater
than $\frac{1}{2}g(x)$ means that the integral $\int_a^c f(x)\,dx$ diverges, and it follows that
$\int_a^b f(x)\,dx$ diverges too. On the other hand, if $\int_a^b g(x)\,dx$ actually converges,
then so does $2\int_a^c g(x)\,dx$ and we can again use the comparison test (you can
fill in the details) to show that $\int_a^b f(x)\,dx$ converges as well.
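As a sanity check on the sneak-preview example $\int_0^1 1/\sin(\sqrt{x})\,dx$, here is a small numerical sketch. It relies on two things that are assumptions of this illustration rather than part of the argument above: a homemade Simpson’s rule helper, and the substitution $u = \sqrt{x}$ (so $dx = 2u\,du$), which turns the improper integral into the ordinary integral $\int_0^1 2u/\sin(u)\,du$.

```python
import math

def simpson(func, a, b, n=1000):
    # Composite Simpson's rule on [a, b] with n subintervals (n made even).
    if n % 2:
        n += 1
    h = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3

def ratio(x):
    # Ratio of 1/sin(sqrt(x)) to 1/sqrt(x); it should approach 1 as x -> 0+.
    return math.sqrt(x) / math.sin(math.sqrt(x))

def transformed_integrand(u):
    # 2u / sin(u), with the value 2 filled in at u = 0 (its limit there).
    return 2.0 if u == 0 else 2 * u / math.sin(u)

for x in (1e-2, 1e-4, 1e-8):
    print(x, ratio(x))

value = simpson(transformed_integrand, 0.0, 1.0)
print("approximate value of the integral:", value)
```

The ratio heads to 1, and the integral comes out as a finite number a little bigger than 2. That fits the remark that the two integrals both converge but not necessarily to the same value, since $\int_0^1 1/\sqrt{x}\,dx$ is exactly 2.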

A quick comment: most textbooks have a different statement of the limit 
comparison test. In particular, the limit of f{x)/g(x) doesn’t actually have 
to be 1 — it could be any positive number and the above argument would still 
work (after a slight modification). On the other hand, allowing a limit other 
than 1 doesn’t really gain anything, and it loses the ability to use the intuitive 
$\sim$ notation. As we’ll see in the next chapter, we’ll get by very nicely with our
version of the test. 

20.5 The p-test (Theory)

Now that we have the comparison test and limit comparison test, we need to 
know how to use them. Our basic strategy, which will be greatly elaborated 
upon in the next chapter, will be to pick a function g which we can compare our 
function / with. Hopefully g is simple enough that we can at least say whether 
its integral (over the region under consideration) converges or diverges. 

The question is, what are some functions we could choose as $g$? Well,
the most useful are the functions $1/x^p$ for some $p > 0$. For example, we
have already looked at some integrals involving $1/x$, $1/\sqrt{x}$, and $1/x^2$, which
correspond to $p = 1$, $\frac{1}{2}$, and 2, respectively. Since these functions are so easy
to integrate, we can use the limit formulas to get the p-test:

• (p-test, $\int^\infty$ version) For any finite $a > 0$, the integral

$$\int_a^\infty \frac{1}{x^p}\,dx$$

converges if $p > 1$ and diverges if $p \le 1$.

• (p-test, $\int_0$ version) For any finite $a > 0$, the integral

$$\int_0^a \frac{1}{x^p}\,dx$$

converges if $p < 1$ and diverges if $p \ge 1$.

Notice that the two versions of the test are basically opposites: except for
when $p = 1$, one of the integrals

$$\int_0^a \frac{1}{x^p}\,dx \quad \text{and} \quad \int_a^\infty \frac{1}{x^p}\,dx$$

converges and the other one diverges. The case $p = 1$ corresponds to $1/x$, and
as we already know, both of the integrals diverge in this case.

Now, this p-test is really useful and comes up often in practice, so it’s
really important that you don’t get the two versions of the test mixed up!
One way to remember the correct version of the test is to remember what
happens with $1/x^2$ and $1/\sqrt{x}$. I just remember the two little facts:

$$\int^\infty \frac{1}{x^2}\,dx \text{ converges, and so does } \int_0 \frac{1}{\sqrt{x}}\,dx.$$

From these two facts, I can remember the whole of the p-test! How does it
work? Well, from the first fact, and the knowledge that what goes on near $\infty$
is opposite from what goes on near 0, I know that

$$\int_0 \frac{1}{x^2}\,dx$$

diverges. Similarly, from the second fact, I know that

$$\int^\infty \frac{1}{\sqrt{x}}\,dx$$

also diverges. What about other exponents? Well, any exponent higher than
1 (for example, 2 or 70) behaves in the same manner as $1/x^2$, and any
exponent lower than 1 (for example, 0.999) behaves exactly like $1/\sqrt{x}$
(remember, this is the same as $1/x^{1/2}$).

It might also help to examine the following diagram:

exists, then the whole integral $\int_a^\infty 1/x^p\,dx$ converges. If on the other hand
the limit doesn’t exist, the integral diverges. So, write the above limit as

$$\lim_{N \to \infty} \frac{1}{N^{p-1}}.$$

If $p > 1$, then $p - 1 > 0$, so $N^{p-1}$ gets very large as $N$ gets large. Its reciprocal
becomes very small, and so the limit is therefore 0 and our original integral
converges. On the other hand, if $p < 1$, then $1 - p > 0$, so $N^{1-p}$ gets very
large and the limit blows up to $\infty$, proving that the original integral diverges.
This proves one half of the p-test. The proof of the other half is almost the
same; you just have to use $\varepsilon \to 0^+$ instead of $N \to \infty$. I’ll leave the details
to you.
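Both halves of the p-test can be explored with the antiderivative of $x^{-p}$. The sketch below assumes the lower or upper limit $a = 1$ for convenience (any finite $a > 0$ would do), and the function names are hypothetical; the formulas are just $x^{1-p}/(1-p)$ for $p \ne 1$ and $\ln(x)$ for $p = 1$.

```python
import math

def integral_up_to(p, big_n, a=1.0):
    # Integral of x^(-p) from a to N, using the antiderivative.
    if p == 1:
        return math.log(big_n) - math.log(a)
    return (big_n ** (1 - p) - a ** (1 - p)) / (1 - p)

def integral_down_to(p, eps, a=1.0):
    # Integral of x^(-p) from eps to a, using the antiderivative.
    if p == 1:
        return math.log(a) - math.log(eps)
    return (a ** (1 - p) - eps ** (1 - p)) / (1 - p)

for p in (0.5, 1, 2):
    near_infinity = [integral_up_to(p, 10.0 ** k) for k in (2, 4, 8)]
    near_zero = [integral_down_to(p, 10.0 ** (-k)) for k in (2, 4, 8)]
    print(f"p = {p}: near infinity {near_infinity}, near 0 {near_zero}")
```

For $p = 2$ the values near $\infty$ settle down and the values near 0 blow up; for $p = \frac{1}{2}$ it’s the other way around; and for $p = 1$ both lists keep growing.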

20.6 The Absolute Convergence Test

One of the assumptions in the comparison test is that the functions $f$ and $g$ are
always nonnegative. What if you want to investigate the behavior of a function
which is sometimes negative? Well, if the function is always negative, you
could just pull out a minus sign and reduce it to the case of a positive function.
We’ll see an example of this in the next chapter. On the other hand, if the
function keeps oscillating between positive and negative values throughout
the region of integration, you can appeal to the absolute convergence test.
Here’s what it says:

$$\text{if } \int_a^b |f(x)|\,dx \text{ converges, then so does } \int_a^b f(x)\,dx.$$

This also works on infinite regions of integration (such as $[a, \infty)$ instead of
$[a, b]$). Watch out: if the absolute-value version of the original integral
diverges, then the original integral could still converge! Such examples are
pretty cool but they’re beyond the scope of this book. On the other hand,
we’ll see something similar when we look at alternating series in Section 23.7
of Chapter 23.

Why is the above test useful? Well, for one thing, $|f(x)|$ is always
nonnegative, so you can use the comparison test on improper integrals involving
it. For example, consider the improper integral

$$\int_1^\infty \frac{\sin(x)}{x^2}\,dx.$$

The integrand $\sin(x)/x^2$ oscillates between positive and negative values as $x$
gets larger and larger without bound. So we can’t use the comparison test*
or the limit comparison test yet. Let’s try the absolute convergence test first.


* Direct comparison won’t work, since the integrals $\int_1^N \sin(x)/x^2\,dx$ are not increasing
in $N$ as $N$ gets bigger. The idea of the argument at the end of Section 20.3 above fails
since it depends on the integrals getting bigger and bigger without bumping into the ceiling
provided by the integral of $g$.

We need to consider this integral instead:

$$\int_1^\infty \left| \frac{\sin(x)}{x^2} \right| dx.$$

This can be rewritten as

$$\int_1^\infty \frac{|\sin(x)|}{x^2}\,dx$$

since $x^2$ can’t be negative. Now we can use the comparison test. You see,
$|\sin(x)| \le 1$ for all $x$, so it follows that

$$\frac{|\sin(x)|}{x^2} \le \frac{1}{x^2}$$

for all $x$. The comparison test says that

$$\int_1^\infty \frac{|\sin(x)|}{x^2}\,dx \le \int_1^\infty \frac{1}{x^2}\,dx.$$

Since the right-hand integral converges by the p-test, so does the left-hand
integral. Finally, we can use the absolute convergence test to say that

$$\int_1^\infty \left| \frac{\sin(x)}{x^2} \right| dx \text{ converges, so } \int_1^\infty \frac{\sin(x)}{x^2}\,dx \text{ also converges.}$$

It’s a little subtle, but you really do need to use those absolute values. 
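To see the roles of the three integrals in this example, here is a numerical sketch. It assumes the lower limit of integration is 1 and uses a homemade Simpson’s rule; the helper names (`simpson`, `signed_partial`, `absolute_partial`, `ceiling`) are invented for this illustration. As always with improper integrals, the numbers only suggest convergence; the actual argument is the one given above.

```python
import math

def simpson(func, a, b, n):
    # Composite Simpson's rule on [a, b] with n subintervals (n made even).
    if n % 2:
        n += 1
    h = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3

def signed_partial(big_n, steps_per_unit=500):
    # Approximates the integral of sin(x)/x^2 from 1 to N.
    n = int((big_n - 1) * steps_per_unit)
    return simpson(lambda x: math.sin(x) / x**2, 1.0, big_n, n)

def absolute_partial(big_n, steps_per_unit=500):
    # Approximates the integral of |sin(x)|/x^2 from 1 to N.
    n = int((big_n - 1) * steps_per_unit)
    return simpson(lambda x: abs(math.sin(x)) / x**2, 1.0, big_n, n)

def ceiling(big_n):
    # The integral of 1/x^2 from 1 to N, which is 1 - 1/N.
    return 1.0 - 1.0 / big_n

for big_n in (10, 100, 200):
    print(big_n, signed_partial(big_n), absolute_partial(big_n), ceiling(big_n))
```

The absolute-value version never gets above the ceiling provided by $1/x^2$, and the signed version wobbles a little but settles down as $N$ grows.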

Here’s another example:

$$\int_0^\infty \cos(x)\,dx.$$

The integrand $\cos(x)$ oscillates between positive and negative values, so maybe
we should look at the absolute value “version” of the integral:

$$\int_0^\infty |\cos(x)|\,dx.$$

Unfortunately, there’s not a hope in hell that this new integral converges.
To see why, draw the graph of $y = |\cos(x)|$ right now and you’ll see that
it’s a series of identical humps, one after the other. There’s no way you can
add up the areas of infinitely many identical humps and get a finite value.
So the absolute value version diverges. This means that we cannot use the
absolute convergence test! The only time that this test can be used is when
the absolute value version of the integral converges.

We have learned nothing from these shenanigans: we are back to square
one. We don’t know whether our original integral converges or diverges. So,
let’s try using the definition of the improper integral with problem spot at $\infty$:

$$\int_0^\infty \cos(x)\,dx = \lim_{N \to \infty} \int_0^N \cos(x)\,dx = \lim_{N \to \infty} \sin(x)\,\Big|_0^N = \lim_{N \to \infty} \sin(N).$$

This last limit doesn’t exist, since $\sin(N)$ keeps on oscillating between $-1$ and
1, never making up its mind even as $N$ becomes larger and larger without
bound. So our original integral $\int_0^\infty \cos(x)\,dx$ diverges, not because it goes to $\infty$
or $-\infty$, but because it oscillates too much.
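You can see this indecision directly by tabulating the partial integrals $\int_0^N \cos(x)\,dx = \sin(N)$. The sketch below is only illustrative (the name `partial_integral` and the choice of whole-number values of $N$ up to 2000 are arbitrary).

```python
import math

def partial_integral(big_n):
    # Integral of cos(x) from 0 to N, which is sin(N) - sin(0).
    return math.sin(big_n) - math.sin(0.0)

values = [partial_integral(n) for n in range(1, 2001)]

# A few sample values: they bounce around instead of settling down.
print(values[:10])
print("smallest:", min(values), "largest:", max(values))
```

No matter how far out you go, the partial integrals keep visiting values near both $-1$ and 1, so there is no single number for them to converge to.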

Oscillating integrals like this are extremely tricky to deal with. If you’re 
lucky, you can use the formal definition as we did above. Most of the time this 
doesn’t work. Many mathematicians have spent a whole lot of time trying to 
understand what’s going on. For the moment, just bear the above example in 
mind. We’ll have more than enough to deal with in the next chapter, where we 
return to the tests and see how to solve problems involving improper integrals. 

Before we do this, let’s take a quick look at why the absolute convergence 
test works. Suppose we know that 

[\f(x)\ dx 

J a 

converges. Now comes a nice trick: set $g(x) = |f(x)| + f(x)$ for all $x$ in $[a, b]$
where $f$ is defined. Then $g$ has two important properties: first, $g(x) \ge 0$, and
second, $g(x) \le 2|f(x)|$. (In both cases, $x$ is any number in $[a, b]$ which is in
the domain of $f$.) In fact, if you think about it, you can see that $g(x)$ actually
equals $2f(x)$ whenever $f(x) \ge 0$, and that $g(x)$ actually equals 0 whenever
$f(x) < 0$. Try to show that the two important properties follow from this.
Anyway, we can now use the comparison test on $g$:

$$0 \le \int_a^b g(x)\,dx \le 2\int_a^b |f(x)|\,dx < \infty.$$


The conclusion is that

$$\int_a^b g(x)\,dx$$

converges as well. So what? Well, notice that $f(x) = g(x) - |f(x)|$, so

$$\int_a^b f(x)\,dx = \int_a^b g(x)\,dx - \int_a^b |f(x)|\,dx.$$


Both integrals on the right converge — the first because we just showed it, and 
the second because we are assuming it — so the left-hand side converges as 
well. 
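
Here is a quick numerical sanity check of the trick, written as a Python sketch. The sample integrand sin(x)/x^2 and the grid of test points are assumptions chosen purely for illustration; any function that changes sign would do.

```python
import math

def f(x):
    # Sample integrand that takes both signs (an assumption for illustration).
    return math.sin(x) / x**2

def g(x):
    # The trick from the argument above: g = |f| + f.
    return abs(f(x)) + f(x)

# Test points between 1 and 51.
xs = [1 + 0.01 * k for k in range(5000)]

# First property: g is never negative.
first_property = all(g(x) >= 0 for x in xs)
# Second property: g is never more than twice |f|.
second_property = all(g(x) <= 2 * abs(f(x)) for x in xs)
# And f can be recovered as g - |f|.
recovers_f = all(math.isclose(f(x), g(x) - abs(f(x)), abs_tol=1e-15) for x in xs)
print(first_property, second_property, recovers_f)
```

A finite grid proves nothing, of course; it just confirms that the two properties behave the way the argument says they should.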











CHAPTER 21 


Improper Integrals: How to Solve Problems


Let’s get practical and look at a lot of examples of improper integrals. As 
we go along, we’ll summarize the main methods. In the previous chapter, 
we introduced some tests that will turn out to be really useful. To use them 
effectively, you have to understand how some common functions behave, especially
near 0 and near $\infty$. By “common functions,” I mean our usual suspects:
polynomials, trig functions, exponentials, and logarithms. So, here’s the game 
plan for this chapter: 

• what to do when you first see an improper integral, including how to deal 
with multiple problem spots and functions which aren’t always positive; 

• summary of the comparison test, limit comparison test, and p-test; 

• the behavior of common functions near $\infty$ and $-\infty$;

• the behavior of common functions near 0; and 

• how to handle problem spots at finite values other than 0. 


21.1 How to Get Started


OK, so you have an improper integral, $\int_a^b f(x)\,dx$. (We’ll always assume that
$f$ is continuous or has finitely many discontinuities.) You know that your
integral is improper because the integrand $f$ has at least one problem spot in
$[a, b]$. Problem spots occur at blow-up points of $f$, like vertical asymptotes,
and also at $\infty$ and $-\infty$, if applicable. For example, the integral



has problem spots at $\infty$ and $-\infty$ (since these are always problem spots if they
are involved), and also at $x = 1$ and $x = -1$ (since the integrand is undefined
there).

As we said in Section 20.1.2 of the previous chapter, it makes sense to 
concentrate on one problem spot at a time. Also, we’d like to arrange matters 
so that the integrand is always positive, at least when x is near the problem 

























summary, if there are no problem spots, the integr 
converges! So, for example, 


ln(rr + 1) 
a; 4 + x 2 + ] 


dx 


converges since the integrand is bounded on the bounded r( 
is, there are no problem spots. Don’t get suckered into using any fancy tests 
in an example like this. 


How to deal with negative function values 

If $f(x)$ takes on negative values for some $x$ in $[a, b]$, which often happens when
trig functions or logs are present, you need to take special care. Luckily you 
can often reduce matters to integrals with only positive integrands. Here are 
three ways to deal with negative function values: 

1. If the integrand $f(x)$ is both positive and negative as $x$ ranges over $[a, b]$,
you should consider trying the absolute convergence test. As we saw in
Section 20.6 of the previous chapter, this says that

if $\int_a^b |f(x)|\,dx$ converges, then so does $\int_a^b f(x)\,dx$.

This test is particularly useful for investigating improper integrals involving
trig functions when the region of integration is unbounded. The
example

$$\int_1^\infty \frac{\sin(x)}{x^2}\,dx$$

from Section 20.6 of the previous chapter is of this type. Recall that the
way to start is to consider the absolute version of the integral, namely,

$$\int_1^\infty \frac{|\sin(x)|}{x^2}\,dx;$$

you don’t need absolute values around the denominator since it’s always
positive. Then show that this new integral converges (see page 448 for
the details) and conclude that the original integral converges as well, by
the absolute convergence test. In general, don’t forget this important
point: the absolute convergence test only helps you show that an integral
converges. That is, you cannot use the absolute convergence test
to show that an integral diverges!

2. Suppose that the integrand $f(x)$ is always negative (or zero) on $[a, b]$.
That is, $f(x) \le 0$ on $[a, b]$. If this is true, you can write

$$\int_a^b f(x)\,dx = -\int_a^b (-f(x))\,dx.$$

So what? Well, $-f(x)$ is now always nonnegative, so you can use the
comparison test or the p-test to see whether $\int_a^b (-f(x))\,dx$ converges
or diverges. Of course, if this integral converges, so does $\int_a^b f(x)\,dx$,
and similarly if $\int_a^b (-f(x))\,dx$ diverges, so does $\int_a^b f(x)\,dx$. Here’s an
example: consider

$$\int_0^{1/2} \frac{1}{x^2\ln(x)}\,dx.$$

There’s certainly a problem spot at $x = 0$. The thing to realize is that
$\ln(x)$ is actually negative for $x$ between 0 and 1, so it’s a good idea to
start out by writing

$$\int_0^{1/2} \frac{1}{x^2\ln(x)}\,dx = -\int_0^{1/2} \frac{1}{x^2(-\ln(x))}\,dx.$$

Actually, since $\ln(x)$ is the negative part here, you can replace $-\ln(x)$
by $|\ln(x)|$ as follows:

$$\int_0^{1/2} \frac{1}{x^2\ln(x)}\,dx = -\int_0^{1/2} \frac{1}{x^2|\ln(x)|}\,dx.$$

Now we can just worry about

$$\int_0^{1/2} \frac{1}{x^2|\ln(x)|}\,dx.$$


Unfortunately you’ll have to wait until page 474 to see that this last 
integral diverges. The conclusion will then be that the original integral 
diverges as well. Note that the absolute convergence test doesn’t work in 
this case, since that test can only be used to show an improper integral 
converges. 

3. If neither of the previous two cases seems to apply, you may be able to 
use the formal definition of the improper integral to see what’s going on. 
An example of this is 

$$\int_0^\infty \cos(x)\,dx,$$

which we looked at on page 448. 

This is not the end of the story. There are slightly freaky improper integrals 
which converge, but which are not absolutely convergent.* These sorts of improper
integrals seem to come up quite often in actual physics and engineering
applications, but they are beyond the scope of this book. So, it’s time to go 
back and review the integral tests. 


21.2 Summary of Integral Tests

The most valuable tools you have at your disposal are the comparison test,
the limit comparison test, and the p-test. We looked at these tests from a
theoretical point of view in the previous chapter; here are the statements
once again, for reference. In all the tests below, the integrand $f(x)$ is
assumed to be positive on the region of integration.

*For example, $\int^{\infty} \sin(x)/x\,dx$ converges but $\int^{\infty} |\sin(x)|/x\,dx$ diverges. Kudos to you if
you can work out why either of these assertions is true.

• Comparison test, divergence version: if you think that $\int_a^b f(x)\,dx$
diverges, find a smaller function whose integral also diverges. That is,
find a nonnegative function $g$ such that $f(x) \ge g(x)$ on $(a, b)$, and such
that $\int_a^b g(x)\,dx$ diverges. Then

$$\int_a^b f(x)\,dx \ge \int_a^b g(x)\,dx = \infty,$$

so $\int_a^b f(x)\,dx$ diverges.

• Comparison test, convergence version: if you think that $\int_a^b f(x)\,dx$
converges, find a larger function whose integral also converges. That is,
find a function $g$ such that $f(x) \le g(x)$ for all $x$ in $(a, b)$, and such that
$\int_a^b g(x)\,dx$ converges. Then

$$\int_a^b f(x)\,dx \le \int_a^b g(x)\,dx < \infty,$$

so $\int_a^b f(x)\,dx$ also converges.

Beware of the whoop-di-doo case! This was discussed in Section 20.3 of the 
previous chapter, and arises if you get the above inequalities the wrong way 
around. The comparison test just doesn’t work if you screw up the direction 
of the inequalities. 

As an alternative to the comparison test, there is the limit comparison 
test. This is useful when you can find a function which behaves just like the 
integrand near the problem spot. In Section 20.4.1 in the previous chapter, 
we made the following definition: 


$f(x) \sim g(x)$ as $x \to a$ means the same thing as $\displaystyle\lim_{x\to a} \frac{f(x)}{g(x)} = 1$.

The definition also applies if you replace both instances of $x \to a$ by $x \to \infty$
(or $x \to -\infty$). In any case, if your integrand $f$ is really nasty and you can
find a nicer function $g$ such that $f(x) \sim g(x)$ as $x$ approaches the problem
spot, you’re in business! That’s because the limit comparison test says that
whatever goes for $g$ also goes for $f$. More precisely, here are two versions of
the test depending on whether the problem spot is infinite or finite:


• Limit comparison test, $\infty$ version: find a simpler nonnegative function
$g$ with no problem spots in $[a, \infty)$, such that $f(x) \sim g(x)$ as $x \to \infty$.
Then

  - if $\int_a^\infty g(x)\,dx$ converges, so does $\int_a^\infty f(x)\,dx$; whereas
  - if $\int_a^\infty g(x)\,dx$ diverges, so does $\int_a^\infty f(x)\,dx$.

Of course, you can change the region $[a, \infty)$ into $(-\infty, b]$ and everything still
works. There’s also a version which applies when the problem spot is at some
finite value $a$, which is at the left endpoint of the region of integration:











• Limit comparison test, finite version: find a simpler nonnegative
function $g$ with no problem spots on $(a, b]$ so that $f(x) \sim g(x)$ as $x \to a$.
Then

  - if $\int_a^b g(x)\,dx$ converges, so does $\int_a^b f(x)\,dx$; whereas
  - if $\int_a^b g(x)\,dx$ diverges, so does $\int_a^b f(x)\,dx$.

Needless to say, this is also true if the only problem spot is at the right
endpoint $x = b$ instead of $x = a$, provided that $f(x) \sim g(x)$ as $x \to b$ (not $a$).

So it’s up to us to pluck an appropriate function g out of thin air to use 
as a comparison. It turns out that a lot of problems can be solved simply 
by taking $g(x)$ to be equal to $1/x^p$ for some appropriately chosen $p$. The
convergence or divergence of the integral of such a function is precisely stated 
by the p-test: 

• p-test, $\int^\infty$ version: for any finite $a > 0$, the integral

$$\int_a^\infty \frac{1}{x^p}\,dx \quad\text{converges if } p > 1 \text{ and diverges if } p \le 1.$$

• p-test, $\int_0$ version: for any finite $a > 0$, the integral

$$\int_0^a \frac{1}{x^p}\,dx \quad\text{converges if } p < 1 \text{ and diverges if } p \ge 1.$$


Learn all these tests well — they are your friends. 
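
To get a feel for the p-test, here is a minimal Python sketch that evaluates the two families of integrals exactly, using the antiderivative of 1/x^p. The function names, the choice a = 1, and the sample values of p are assumptions made for illustration.

```python
import math

def tail_integral(p, N, a=1.0):
    """Exact value of the integral of 1/x**p from a to N (with 0 < a < N)."""
    if p == 1:
        return math.log(N / a)
    return (N ** (1 - p) - a ** (1 - p)) / (1 - p)

def near_zero_integral(p, eps, a=1.0):
    """Exact value of the integral of 1/x**p from eps to a (with 0 < eps < a)."""
    if p == 1:
        return math.log(a / eps)
    return (a ** (1 - p) - eps ** (1 - p)) / (1 - p)

for p in (0.5, 1.0, 2.0):
    # Push the upper endpoint toward infinity, then the lower endpoint toward 0.
    toward_infinity = [round(tail_integral(p, 10.0 ** k), 3) for k in (2, 4, 6)]
    toward_zero = [round(near_zero_integral(p, 10.0 ** -k), 3) for k in (2, 4, 6)]
    print(p, toward_infinity, toward_zero)
```

For p = 2 the first list levels off while the second blows up; for p = 0.5 it’s the other way around; and for p = 1 both lists keep growing, just slowly.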


21.3 Behavior of Common Functions near ∞ and −∞

OK, it’s now time to answer the most important question of them all: how 
do you choose the comparison function $g$? This depends on whether the
problem spot is at $\pm\infty$, 0, or some other finite value, so we’ll consider these
cases separately. In almost all the cases we’ll look at, we are just restating
limits and inequalities that we’ve seen earlier, then applying these principles
to investigating improper integrals. Now let’s start by looking at how common
functions behave near $\infty$ or $-\infty$.

21.3.1 Polynomials and poly-type functions near ∞ and −∞

As far as polynomials are concerned, the highest power dominates as
$x \to \infty$ or $x \to -\infty$. More precisely, suppose that $p$ is a polynomial; then it’s
true that

if the highest-degree term of $p(x)$ is $ax^n$, then
$p(x) \sim ax^n$ as $x \to \infty$ or as $x \to -\infty$.

For example, we have

$$x^5 + 4x^4 + 1 \sim x^5 \quad\text{as } x \to \infty.$$

Don’t take my word for it: you can check this by showing that the ratio of
the quantities $x^5 + 4x^4 + 1$ and $x^5$ has limit 1 as $x \to \infty$. Here’s how it works:

$$\lim_{x\to\infty} \frac{x^5 + 4x^4 + 1}{x^5} = \lim_{x\to\infty} \left(1 + \frac{4}{x} + \frac{1}{x^5}\right) = 1.$$

We also discussed the above principle in Section 4.3 of Chapter 4.
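
If you’d rather see the ratio head toward 1 with actual numbers, here is a tiny Python sketch. The function names `p` and `ratio` and the sample points are just illustrative choices.

```python
def p(x):
    # The polynomial from the example above.
    return x**5 + 4 * x**4 + 1

def ratio(x):
    # p(x) divided by its highest-degree term x**5.
    return p(x) / x**5

for x in (10.0, 1e3, 1e6, -1e6):
    print(x, ratio(x))
```

The ratio is about 1.4 at x = 10 but within a few millionths of 1 by the time x reaches a million (in either direction), which is what the relation says should happen.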

If $p$ is a poly-type function instead of a polynomial, a similar principle
applies. (See Section 4.4 in Chapter 4 if you want to learn more about poly-type
functions.) For example, to understand the behavior of $3\sqrt{x} - 2\sqrt[3]{x} + 4$ as
$x \to \infty$, write it as $3x^{1/2} - 2x^{1/3} + 4$; then since the highest power is 1/2, we
can say that $3\sqrt{x} - 2\sqrt[3]{x} + 4 \sim 3\sqrt{x}$ as $x \to \infty$. (That’s not true as $x \to -\infty$,
since you can’t take the square root of a negative number!)

Sometimes the highest power isn’t easily identifiable. Here’s an example:
$\sqrt{x^4 + 8x^3 - 9} - x^2$ is a poly-type function of $x$ which seems to have highest
power 4, but of course you have to take the square root, which knocks the
power down to 2. By the time you cancel out the $x^2$ terms, the highest power
is pretty weird. We’ll see how to deal with a problem like this at the end of
this section.

Since we have many new asymptotic relations, we can use the limit comparison
test to analyze a lot of improper integrals. For example, consider

$$\int_1^\infty \frac{1}{2 + 20\sqrt{x}}\,dx \qquad\text{and}\qquad \int_0^\infty \frac{1}{x^5 + 4x^4 + 1}\,dx.$$


In both cases, $\infty$ is the only problem spot. Let’s look at the first integral. The
denominator $2 + 20\sqrt{x}$ may be written as $2 + 20x^{1/2}$; here 1/2 is the highest
power. So it’s true that $2 + 20\sqrt{x} \sim 20x^{1/2}$ as $x \to \infty$, and it follows that

$$\frac{1}{2 + 20\sqrt{x}} \sim \frac{1}{20x^{1/2}} \quad\text{as } x \to \infty.$$

Now, the integral

$$\int_1^\infty \frac{1}{20x^{1/2}}\,dx$$

diverges by the p-test, so by the limit comparison test, the integral

$$\int_1^\infty \frac{1}{2 + 20\sqrt{x}}\,dx$$

also diverges. As for the second integral above, since $x^5 + 4x^4 + 1 \sim x^5$ as
$x \to \infty$, the same is true for the reciprocals:

$$\frac{1}{x^5 + 4x^4 + 1} \sim \frac{1}{x^5} \quad\text{as } x \to \infty.$$

Now, we have to be careful! We’d like to say that the integral we want behaves
exactly like the integral $\int_0^\infty 1/x^5\,dx$; the difficulty here is that this integral
now has an extra problem spot at $x = 0$. In fact, this integral diverges,
but only because of the problem spot at 0. This would lead to the wrong
answer altogether. In order to avoid these inanities, we should have started
by splitting the original integral into the pieces

$$\int_0^1 \frac{1}{x^5 + 4x^4 + 1}\,dx \qquad\text{and}\qquad \int_1^\infty \frac{1}{x^5 + 4x^4 + 1}\,dx.$$

The first of these integrals converges because there are no problem spots. As
for the second, we have

$$\frac{1}{x^5 + 4x^4 + 1} \sim \frac{1}{x^5} \quad\text{as } x \to \infty;$$

since $\int_1^\infty 1/x^5\,dx$ converges, so does the integral

$$\int_1^\infty \frac{1}{x^5 + 4x^4 + 1}\,dx.$$

Both pieces converge, so our original integral converges too. Beware of this
situation: it arises often, so make sure that you split up the integral. Basically,
if the “limit comparison function” $g$ has a problem spot that the original
function doesn’t, you have to split up the original integral to avoid introducing
a new problem spot. Normally the new integrand $g(x)$ will be of the form
$1/x^p$, so you just need to avoid $x = 0$ when you have a problem spot at $\infty$,
just as in our example.

Here’s another example: let’s investigate 


3a: 5 + 2a: 2 + 9 


x 6 4 - 22a; 4 + + 18a: 


: dx. 


This is a little more complicated. The only problem spot is at $\infty$. The numerator
of the integrand is easy to handle: $3x^5 + 2x^2 + 9 \sim 3x^5$ as $x \to \infty$. As for
the denominator, first note that $\sqrt{4x^{13} + 18x} \sim \sqrt{4x^{13}} = 2x^{13/2}$ as $x \to \infty$.
Since 13/2 is greater than 6, the $\sqrt{4x^{13} + 18x}$ term actually dominates the
rest of the denominator, $x^6 + 22x^4$, so the whole denominator is asymptotic
to $2x^{13/2}$ as $x \to \infty$. Putting this all together, we get

$$\frac{3x^5 + 2x^2 + 9}{x^6 + 22x^4 + \sqrt{4x^{13} + 18x}} \sim \frac{3x^5}{2x^{13/2}} = \frac{3}{2}\cdot\frac{1}{x^{3/2}} \quad\text{as } x \to \infty.$$

Since the p-test shows that the integral 


3 

2 






converges, so does our original integral, by the limit comparison test. 
Finally, consider

$$\int_9^\infty \frac{1}{\sqrt{x^4 + 8x^3 - 9} - x^2}\,dx.$$

As we discussed above, the highest power in the denominator is difficult to pin
down, since $\sqrt{x^4}$ and $-x^2$ cancel out. So, we have to multiply top and bottom
by the conjugate expression of the denominator. (We’ve used this trick many
times before; see Section 4.2 in Chapter 4 for some examples.) We get:

$$\int_9^\infty \frac{1}{\sqrt{x^4 + 8x^3 - 9} - x^2}\,dx = \int_9^\infty \frac{1}{\sqrt{x^4 + 8x^3 - 9} - x^2} \times \frac{\sqrt{x^4 + 8x^3 - 9} + x^2}{\sqrt{x^4 + 8x^3 - 9} + x^2}\,dx.$$

I leave it to you to simplify this to

$$\int_9^\infty \frac{\sqrt{x^4 + 8x^3 - 9} + x^2}{8x^3 - 9}\,dx.$$


The denominator is easy to handle: $8x^3 - 9 \sim 8x^3$ as $x \to \infty$. How about
the numerator? Well, $x^4 + 8x^3 - 9 \sim x^4$, so $\sqrt{x^4 + 8x^3 - 9} \sim x^2$, and finally
$\sqrt{x^4 + 8x^3 - 9} + x^2 \sim 2x^2$ (all as $x \to \infty$). The last statement was a little
tricky, since you’re not allowed to add or subtract asymptotic relations.
To justify the statement, we need to show that the ratio of the quantities
$\sqrt{x^4 + 8x^3 - 9} + x^2$ and $2x^2$ goes to 1 as $x \to \infty$. Here’s how:

$$\lim_{x\to\infty} \frac{\sqrt{x^4 + 8x^3 - 9} + x^2}{2x^2} = \lim_{x\to\infty} \frac{1}{2}\left(\frac{\sqrt{x^4 + 8x^3 - 9}}{x^2} + \frac{x^2}{x^2}\right).$$

Now drag the $x^2$ on the denominator into the square root (as $x^4$) and simplify
to see that the above limit is

$$\lim_{x\to\infty} \frac{1}{2}\left(\sqrt{\frac{x^4 + 8x^3 - 9}{x^4}} + 1\right) = \lim_{x\to\infty} \frac{1}{2}\left(\sqrt{1 + \frac{8}{x} - \frac{9}{x^4}} + 1\right) = \frac{1}{2}\left(\sqrt{1 + 0 - 0} + 1\right) = 1.$$

This proves that $\sqrt{x^4 + 8x^3 - 9} + x^2 \sim 2x^2$ as $x \to \infty$. Now we can return to
our original integrand and write

$$\frac{1}{\sqrt{x^4 + 8x^3 - 9} - x^2} = \frac{\sqrt{x^4 + 8x^3 - 9} + x^2}{8x^3 - 9} \sim \frac{2x^2}{8x^3} = \frac{1}{4x} \quad\text{as } x \to \infty.$$

Let’s use the limit comparison test; since $\int_9^\infty 1/4x\,dx$ diverges, so does the
original integral. By the way, would you have guessed that the original integrand
is asymptotic to $1/4x$ as $x \to \infty$? It’s not so easy to see, so if you
want to use the fact that the highest power dominates, make sure you have
one and only one clear highest power!
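
Here is a short Python sketch that checks the claim numerically: the integrand divided by 1/(4x) should approach 1. The function names are made up for illustration. One thing worth noticing is that the sketch uses the conjugate form for large x, since the direct formula subtracts two nearly equal numbers and loses accuracy in floating-point arithmetic (that remark is about computers, not about the calculus).

```python
import math

def integrand_naive(x):
    # Direct formula: subtracts two nearly equal numbers when x is large.
    return 1.0 / (math.sqrt(x**4 + 8 * x**3 - 9) - x**2)

def integrand_conjugate(x):
    # The same function after multiplying top and bottom by the conjugate.
    return (math.sqrt(x**4 + 8 * x**3 - 9) + x**2) / (8 * x**3 - 9)

def ratio_to_comparison(x):
    # Divide the integrand by the comparison function 1/(4x).
    return integrand_conjugate(x) * 4 * x

for x in (9.0, 1e2, 1e4, 1e6):
    print(x, ratio_to_comparison(x))
```

At x = 9 the two formulas agree to many decimal places, and as x grows the ratio drifts down toward 1.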


21.3.2 Trig functions near ∞ and −∞

Perhaps the only really useful thing we can say here is that

$$|\sin(A)| \le 1 \quad\text{and}\quad |\cos(A)| \le 1$$

for any real number $A$. It’s not much, but it’s better than nothing. (The other
trig functions have too many vertical asymptotes, so they don’t satisfy similar
inequalities.) There are two main applications of the above inequalities. One
is that you can use the comparison test in many cases. For example, does the
integral

$$\int_5^\infty \frac{|\sin(x^4)|}{\sqrt{x} + x^2}\,dx$$

converge or diverge? Well, let’s start by using $|\sin(x^4)| \le 1$. Note that it
doesn’t matter that we are taking the sine of $x^4$ instead of $A$: the sine (or
cosine) of anything is no more than 1 in absolute value. So, we have

$$\int_5^\infty \frac{|\sin(x^4)|}{\sqrt{x} + x^2}\,dx \le \int_5^\infty \frac{1}{\sqrt{x} + x^2}\,dx.$$

Great, we got rid of all the trig in the expression. The only problem spot in
the right-hand integral is at $\infty$. Since the highest power dominates for large
$x$, we have $\sqrt{x} + x^2 \sim x^2$ as $x \to \infty$. Now take reciprocals to see that

$$\frac{1}{\sqrt{x} + x^2} \sim \frac{1}{x^2} \quad\text{as } x \to \infty.$$

By the p-test, we know that $\int_5^\infty 1/x^2\,dx$ converges, so the limit comparison
test tells us that

$$\int_5^\infty \frac{1}{\sqrt{x} + x^2}\,dx$$

also converges. Finally, we see that

$$\int_5^\infty \frac{|\sin(x^4)|}{\sqrt{x} + x^2}\,dx \le \int_5^\infty \frac{1}{\sqrt{x} + x^2}\,dx < \infty,$$

so our original integral converges by the comparison test.

The other nice application of the facts that $|\sin(A)| \le 1$ and $|\cos(A)| \le 1$ is
that you can treat the sine or cosine of anything as inconsequential compared
to any positive power of $x$, at least as $x \to \infty$ or $x \to -\infty$. For example,

$$2x^3 - 3x^{0.1} + \sin(100x^{200}) \sim 2x^3 \quad\text{as } x \to \infty.$$

Why? Because the sine term is laughably small compared to $2x^3$ when $x$ is a
large number. To be more precise, we have

$$\frac{2x^3 - 3x^{0.1} + \sin(100x^{200})}{2x^3} = 1 - \frac{3}{2x^{2.9}} + \frac{\sin(100x^{200})}{2x^3}.$$

The term $3/2x^{2.9}$ goes to 0 as $x \to \infty$; the main point is that you can use the
sandwich principle to show that

$$\lim_{x\to\infty} \frac{\sin(100x^{200})}{2x^3} = 0.$$

I’ll leave the details to you, because we looked at similar examples way back
in Section 7.1.3 of Chapter 7. In any case, we have shown that

$$\lim_{x\to\infty} \frac{2x^3 - 3x^{0.1} + \sin(100x^{200})}{2x^3} = 1.$$

This proves that

$$2x^3 - 3x^{0.1} + \sin(100x^{200}) \sim 2x^3 \quad\text{as } x \to \infty$$

after all. This would be useful if you want to understand whether or not the
following integral converges:

$$\int_8^\infty \frac{1}{2x^3 - 3x^{0.1} + \sin(100x^{200})}\,dx.$$

By the limit comparison test and the above asymptotic relation, the integral
behaves the same as $\int_8^\infty 1/2x^3\,dx$ does. Since this last integral converges by
the p-test, so does our original integral above.


21.3.3 Exponentials near ∞ and −∞

Here’s a really useful principle: exponentials grow faster than polynomials.
We first saw this in Section 9.4.4 of Chapter 9. There we expressed
the principle in the form

$$\lim_{x\to\infty} \frac{x^n}{e^x} = 0,$$

where $n$ is any positive number, even a very large one. Now consider the
function $f$ defined by $f(x) = x^n/e^x$. We know that $f(0) = 0$; also, the above
limit says that $f(x) \to 0$ as $x \to \infty$. So how large could $f(x)$ possibly be
when $x > 0$? It starts at 0, has no vertical asymptotes, and goes back down
to have a horizontal asymptote at $y = 0$. There must be some maximum
height that the graph of $y = f(x)$ gets to. Let’s call it $C$; this means that
$f(x) = x^n/e^x \le C$ for all $x > 0$. (Note that you get a different $C$ for each
$n$, but that doesn’t really affect us at all.) Now, writing $1/e^x$ as $e^{-x}$ and
dividing both sides by $x^n$, we get the useful inequality

$$e^{-x} \le \frac{C}{x^n} \quad\text{for all } x > 0.$$

As we noted in Section 9.4.4 of Chapter 9, the same is true if you replace
$e^{-x}$ by $e^{-p(x)}$, where $p(x)$ is any polynomial-type expression that goes to
infinity when $x \to \infty$, and also if the base $e$ is replaced by any other number
greater than 1. For example, the same inequality is true if e~ x is replaced by 
2_5x 5 +/a^+3. The important point is that you get to choose any n you like, 
and you often have to be careful that you make it large enough. For example, 


consider 



x s e~ x dx. 


The good news is that the integrand is positive and there are no problem
spots except for $\infty$. The bad news is that the $x^3$ factor grows quickly as
$x \to \infty$. However, the $e^{-x}$ factor decays (to 0) very fast and actually beats
the $x^3$ factor to a pulp. To see this, we’ll notice that

$$e^{-x} \le \frac{C}{x^5}.$$

This is just the above boxed inequality, with $n$ chosen to be 5. Why 5?
Because it works: 


We have used the p-test to show that $C\int^{\infty} 1/x^2\,dx$ converges. The comparison
test now shows that the original integral converges as well. Now, how did
I know to use $x^5$? What would happen if I used, say, $e^{-x} \le C/x^4$ instead? It
doesn’t work:


We are firmly in whoop-di-doo territory here, since we have just shown that
the original integral is either finite or infinite, that is, we have shown absolutely
nothing. On the other hand, if we’d used $x^{4.0001}$, it would have worked. Why?
Convince yourself that the exponent you choose can be any number greater
than 4, and the argument still works. In practice, it’s good to choose a number
2 more than the power you are trying to kill. Here we wanted to kill $x^3$, so
we used $e^{-x} \le C/x^5$.

An important point: it is wrong, wrong, wrong to write $x^3e^{-x} \sim e^{-x}$ as
$x \to \infty$. It simply isn’t true! If it were, then you could cancel out the positive
quantity $e^{-x}$ to conclude that $x^3 \sim 1$ as $x \to \infty$, and this is just crazy talk.
So you should use the comparison test, not the limit comparison test, in the
previous example.
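
The constant C never has to be found in these arguments, but it can be reassuring to see one. A little calculus (set the derivative of x^n e^(-x) to zero) shows that the maximum occurs at x = n, so C = (n/e)^n works. The Python sketch below uses that fact to check the bound with n = 5; the function names and the grid of sample points are assumptions for illustration only.

```python
import math

def best_constant(n):
    """Maximum of x**n * exp(-x) for x > 0, which is attained at x = n."""
    return (n / math.e) ** n

C = best_constant(5)

def integrand(x):
    # The integrand from the example above.
    return x**3 * math.exp(-x)

def bound(x):
    # The comparison function C/x**2 that comes from exp(-x) <= C/x**5.
    return C / x**2

# Sample points from 0.5 up to about 100 (this grid includes x = 5).
xs = [0.5 + 0.25 * k for k in range(400)]
# A tiny relative tolerance absorbs rounding at x = 5, where the two sides are equal.
holds_everywhere = all(integrand(x) <= bound(x) * (1 + 1e-12) for x in xs)
print(C, holds_everywhere)
```

With n = 5 the constant comes out to roughly 21, and the bound holds at every sample point, with equality exactly where x = 5.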

Now look at this integral: 


Here we need to do a bit of work. The integrand looks as if it might be
oscillating between positive and negative values because of the $\sin(x)$ term,
but that’s not true because $\sin(x)$ isn’t big enough to affect the positivity of
$x^{1000} + x^2$ when $x > 10$. In any case, the first observation is that we have
$x^{1000} + x^2 + \sin(x) \sim x^{1000}$ as $x \to \infty$, since the $x^2$ and $\sin(x)$ terms get their
butts kicked by the $x^{1000}$ term. (See the previous section if you want to learn
how to provide a slightly more technical explanation!) So we can multiply by
$e^{-x^2+6}$ to see that


Using the limit comparison test, we only need to know whether 


converges or diverges; our original integral will do the same thing. Now we
have to be careful, since the exponential term $e^{-x^2+6}$ doesn’t obey a useful
asymptotic rule. We have to use basic comparison here. You see, $x^{1000}$ really
grows, but $e^{-x^2+6}$ really really really decays. Let’s use


-x 2 +6 














behavior of e near 



3 we used the minus 
oration around. I lea 、 


pretty self-evident, but you can check it using the formal definition or ev 
the p-test with p = 0.) In any case, the comparison test now shows that t 
original integral diverges. 

Let’s also consider what happens when you add an exponential and a
polynomial. As you might expect, if the exponential becomes large, then it
dominates the polynomial. For example, to analyze

first take a look at the denominator $e^x - 5x^{20}$. The $e^x$ term should dominate
the $5x^{20}$ term, so we should have $e^x - 5x^{20} \sim e^x$ as $x \to \infty$. We can prove
this by looking at the limit of the ratio:



(Here we used the limit from the very beginning of this section.) Anyway, 



We’d better work out what happens to the denominator $7^x - 4^x$. Here both
terms $7^x$ and $4^x$ are exponentials, but the one with the highest base should
dominate. That is, $7^x - 4^x \sim 7^x$ as $x \to \infty$. To see why, look at the limit of
the ratio:



converges, so our original integral also converges by the limit comparison test. 







21.3.4 Logarithms near ∞

First, notice that we don’t consider logarithms near $-\infty$, because you can’t
take the log of a negative number! So it’s futile to ask what happens to $\ln(x)$
as $x \to -\infty$.

On the other hand, logs grow slowly at $\infty$. In fact, they grow more
slowly than any positive power of $x$. In symbols, we can say that if $a > 0$ is
some positive number of your choosing, then no matter how small it is, we
have

$$\lim_{x\to\infty} \frac{\ln(x)}{x^a} = 0.$$

We looked at this principle in some detail in Section 9.4.5 of Chapter 9. By a
similar argument to the one we used at the beginning of Section 21.3.3 above,
you can show that there must be a constant $C$ such that

$$\ln(x) \le Cx^a \quad\text{for all } x > 1.$$

The same is true for logs of any base greater than 1, or if $\ln(x)$ is replaced by
the log of a polynomial with positive leading coefficient.

For example, what do you make of

$$\int_2^\infty \frac{\ln(x)}{x^{1.001}}\,dx?$$

Without the $\ln(x)$ term, it would converge by the p-test. The idea is that
the $\ln(x)$ term barely affects anything since it grows really slowly. That’s
pretty waffly, although it is definitely the right conceptual idea. To nail this
question, you have to use $\ln(x) \le Cx^a$, where $a$ is so small that the $x^a$ term
doesn’t destroy a nice property that the number 1.001 has: it is bigger than 1.
For example, if we try $\ln(x) \le Cx^{0.5}$, we get

$$\int_2^\infty \frac{\ln(x)}{x^{1.001}}\,dx \le \int_2^\infty \frac{Cx^{0.5}}{x^{1.001}}\,dx = C\int_2^\infty \frac{1}{x^{0.501}}\,dx = \infty$$

by the p-test. Yep, it’s whoop-di-doo all over again. The integral we want is
less than or equal to $\infty$, which says nothing. Let’s be more subtle and use
$\ln(x) \le Cx^{0.0005}$. Now 0.0005 is a very small number, so small that when
you subtract it from 1.001, you get a number which is still bigger than 1.
Let’s see how it works:

$$\int_2^\infty \frac{\ln(x)}{x^{1.001}}\,dx \le \int_2^\infty \frac{Cx^{0.0005}}{x^{1.001}}\,dx = C\int_2^\infty \frac{1}{x^{1.0005}}\,dx < \infty.$$

The convergence of the right-hand integral above follows from the p-test, since
1.0005 is greater than 1. Now we know that the left-hand integral converges
by the comparison test. You see how subtle it is? The methodology is very
similar to how we handled exponentials in Section 21.3.3 above.
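
Just as with exponentials, you never need the actual value of C, but one is easy to find: the function ln(x)/x^a has its maximum at x = e^(1/a), so C = 1/(ae) works. The Python sketch below uses that (standard, but not proved here) fact to check the inequality with a = 0.0005 over a huge range of x; the names and the sample points are illustrative assumptions.

```python
import math

def log_constant(a):
    """Maximum of ln(x)/x**a for x > 1, which is attained at x = e**(1/a)."""
    return 1.0 / (a * math.e)

a = 0.0005
C = log_constant(a)

# Sample points 2, 2e5, 2e10, ..., up to 2e295.
xs = [2.0 * 10.0 ** k for k in range(0, 300, 5)]
# A tiny relative tolerance guards against rounding.
holds = all(math.log(x) <= C * x**a * (1 + 1e-12) for x in xs)

# The exponent that is left over after the comparison.
exponent_left = 1.001 - a
print(C, holds, exponent_left)
```

The constant is large (about 736), which is the price you pay for such a tiny power of x, but the leftover exponent is still bigger than 1, and that is all the p-test cares about.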

Mind you, the principle that logs grow slowly isn’t useful in every improper
integral involving logs. Here are six improper integrals to consider:

$$\int_2^\infty \frac{\ln(x)}{x^{1.001}}\,dx, \quad \int_2^\infty \frac{1}{x^{1.001}\ln(x)}\,dx, \quad \int_2^\infty \frac{\ln(x)}{x}\,dx,$$

$$\int_2^\infty \frac{1}{x\ln(x)}\,dx, \quad \int_2^\infty \frac{\ln(x)}{x^{0.999}}\,dx, \quad\text{and}\quad \int_2^\infty \frac{1}{x^{0.999}\ln(x)}\,dx.$$

We just looked at the first one and found that it converges. Now look at the
second example:

$$\int_2^\infty \frac{1}{x^{1.001}\ln(x)}\,dx.$$

Here, the integral would still converge without the $\ln(x)$ factor, but this factor
actually helps when it’s on the bottom! That is, when you throw the $\ln(x)$
into the denominator, you are making the denominator larger than it was
before, which makes the whole integrand smaller. This helps the integral to
converge. How do you write this down effectively? You need to express the
idea that $\ln(x)$ is bounded from below when $x$ gets large. In this case,
the region of integration is $[2, \infty)$. So how small can $\ln(x)$ possibly be on this
region? Since $\ln(x)$ is increasing in $x$, we find that $\ln(x)$ is smallest on the
region $[2, \infty)$ when $x = 2$. So all we need to write is $\ln(x) \ge \ln(2)$ when $x \ge 2$.
How does that help? Take reciprocals to find that

$$\frac{1}{\ln(x)} \le \frac{1}{\ln(2)}$$

when $x \ge 2$. Now divide through by $x^{1.001}$ to get our integrand on the left-hand
side:

$$\frac{1}{x^{1.001}\ln(x)} \le \frac{1}{x^{1.001}\ln(2)}.$$

The comparison test now saves the day, since

$$\int_2^\infty \frac{1}{x^{1.001}\ln(x)}\,dx \le \int_2^\infty \frac{1}{x^{1.001}\ln(2)}\,dx = \frac{1}{\ln(2)}\int_2^\infty \frac{1}{x^{1.001}}\,dx < \infty.$$

Remember, $\ln(2)$ is a constant, so it can be pulled out of the integral, and the
integral converges by the p-test since 1.001 is bigger than 1. So the second
of the above six integrals converges. By the way, the precise number $\ln(2)$
is irrelevant: we could have just replaced $\ln(2)$ by some positive constant $C$
without worrying about what $C$ actually is, and the proof would still have
been correct.

How about the third of our above integrals? Look at

$$\int_2^\infty \frac{\ln(x)}{x}\,dx.$$

What happens if you take out the $\ln(x)$ factor from the numerator? We know
that $\int_2^\infty 1/x\,dx$ diverges. Putting the $\ln(x)$ back in the numerator just makes
this worse. So the above integral should diverge. To nail this, let’s use the
inequality $\ln(x) \ge \ln(2)$ for $x \ge 2$ once more (or if you prefer, you could
replace $\ln(2)$ by some constant $C > 0$). We get

$$\int_2^\infty \frac{\ln(x)}{x}\,dx \ge \int_2^\infty \frac{\ln(2)}{x}\,dx = \ln(2)\int_2^\infty \frac{1}{x}\,dx = \infty.$$

By the comparison test, our integral diverges.
As for the fourth integral,

$$\int_2^\infty \frac{1}{x\ln(x)}\,dx,$$

here you have to do something completely different. You see, everything is
very finely balanced. Without the $\ln(x)$ factor, the integral would diverge.
Since the $\ln(x)$ factor is in the denominator, it helps the integral to have a
chance to converge. Does it help it enough? We’d like to use $\ln(x) \le Cx^a$, but
no matter how small you make $a$, you’ll never get a comparison that works.
(Try it and see!) Instead, let’s use a change of variables. Let $t = \ln(x)$, so
that $dt = 1/x\,dx$. When $x = 2$, we see that $t = \ln(2)$, and as $x \to \infty$, also
$t \to \infty$. So

$$\int_2^\infty \frac{1}{x\ln(x)}\,dx = \int_{\ln(2)}^\infty \frac{1}{t}\,dt,$$

where the last integral diverges by the p-test. So our original integral diverges.
On the other hand, let’s change the upper endpoint of the above integral from
$\infty$ to $e^{e^8}$, like this:

$$\int_2^{e^{e^8}} \frac{1}{x\ln(x)}\,dx.$$

The number $e^{e^8}$ is actually really big. My computer says that it’s approximately
$4 \times 10^{1294}$, which means 4 followed by 1,294 zeroes. This is an unbelievably
huge number, which is essentially infinite so far as our poor human brains
can comprehend. Since the integral diverges if the upper endpoint is actually
$\infty$, you’d think that the value of the above integral should be enormous. So
let’s work it out. Using $t = \ln(x)$ once again, we get

$$\int_2^{e^{e^8}} \frac{1}{x\ln(x)}\,dx = \int_{\ln(2)}^{e^8} \frac{1}{t}\,dt = \ln(t)\Big|_{\ln(2)}^{e^8} = \ln(e^8) - \ln(\ln(2)) = 8 - \ln(\ln(2)).$$

Here we have used the fact that when $x = e^{e^8}$, we have $t = \ln(e^{e^8}) = e^8$. In
any case, the final answer is a little over 8 (about 8.37, since $\ln(\ln(2))$ is
negative). That isn’t large at all! This might make you think that our
improper integral

$$\int_2^\infty \frac{1}{x\ln(x)}\,dx$$

converges, but as we just saw, it actually diverges. It just diverges really,
really slowly.

Now let’s consider

$$\int_2^\infty \frac{1}{x(\ln(x))^{1.1}}\,dx.$$


If you use the substitution $t = \ln(x)$ once again, you get

$$\int_2^\infty \frac{1}{x(\ln(x))^{1.1}}\,dx = \int_{\ln(2)}^\infty \frac{1}{t^{1.1}}\,dt,$$

where this last integral now converges by the p-test. So the new integral
converges. Just throwing in a tiny extra power of $\ln(x)$ on the bottom, namely
$(\ln(x))^{0.1}$, is enough to cause convergence. That’s pretty whacked out.
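
If you want to see just how slowly the first of these integrals diverges, and what the second one converges to, here is a Python sketch based on the substitution t = ln(x). The function names are made up, and the closed forms come from the antiderivatives ln(t) and t^(1-q)/(1-q); nothing here is needed for the convergence arguments themselves.

```python
import math

def slow_integral(X):
    """Integral of 1/(x ln x) from 2 to X, computed via t = ln(x)."""
    return math.log(math.log(X)) - math.log(math.log(2))

def log_power_integral_to_infinity(q=1.1):
    """Integral of 1/(x (ln x)**q) from 2 to infinity, valid for q > 1."""
    return math.log(2) ** (1 - q) / (q - 1)

# X = e**(e**8) is far too big for a float, so work with ln(X) = e**8 directly.
value_at_huge_endpoint = math.log(math.exp(8)) - math.log(math.log(2))

print(value_at_huge_endpoint)            # upper endpoint e**(e**8)
print(slow_integral(1e300))              # upper endpoint 10**300
print(log_power_integral_to_infinity())  # exponent 1.1, all the way to infinity
```

Even with an upper endpoint of 10^300, the divergent integral hasn’t reached 7 yet, while the convergent one with the extra power of 0.1 totals only a bit more than 10.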

We still have two more integrals to look at. The first is

21.4 Behavior of Common Functions near 0

We now know all about how polynomials, trig functions, exponentials, and 
logarithms behave at infinity. Now let’s see what happens to them near zero. 

21.4.1 Polynomials and poly-type functions near 0

For polynomials, the lowest power dominates as a: — 0. This is the 
opposite of what happens as x ^ oo! To be more precise, suppose that p is a 
polynomial; then it’s true that 

if the lowest-degree term of p(x) is bx 771 ， then p($) 〜 bx m as o: — 0. 


For example, 5a: 4 — a: 3 + 2x : 
that the limit of the ratio is 


2a; 2 as x —> 0. Let’s check this by showing 


= 13( 竽 


For poly-type functions, it，s not always easy to find the lowest-degree term, 
but the general principle still holds water. So, for example, x 2 + yfi 〜 y/x 
as a: — 0+, since y/x = x 1 ^ 2 and 1/2 is smaller than 2. (By the way, it’s as 
: c — 0+ because you can’t take the square root of a negative number.) The 
principle even works if constants are present — they are really multiples of 
which is a very low-degree term! So, for example, 2a: 1 / 3 + 4 〜 4 as a: —> 0, as 
4x° has a lower exponent than 2a: 1 / 3 . 
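If you'd like to see the lowest-degree term taking over, a quick numerical check of the ratio from the example above does the job. This is just a sketch; the helper names `p` and `lowest_term` are invented here.

```python
def p(x):
    """The example polynomial 5x^4 - x^3 + 2x^2."""
    return 5 * x**4 - x**3 + 2 * x**2

def lowest_term(x):
    """Its lowest-degree term, 2x^2."""
    return 2 * x**2

# The ratio should approach 1 as x approaches 0 from either side.
for x in (0.1, 0.01, 0.001, -0.001):
    print(x, p(x) / lowest_term(x))
```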

Let’s look at some examples of improper integrals. First, consider

$$\int_0^5 \frac{1}{x^2+\sqrt{x}}\,dx.$$

The only problem spot is at $x = 0$. Now we know that

$$\frac{1}{x^2+\sqrt{x}} \sim \frac{1}{\sqrt{x}} \quad\text{as } x \to 0^+.$$

Since $\int_0^5 1/\sqrt{x}\,dx$ converges (by the p-test), so does

$$\int_0^5 \frac{1}{x^2+\sqrt{x}}\,dx$$

(by the limit comparison test). So our integral converges, and it’s all because
of the $\sqrt{x}$ term. Without it, we’d only have $1/x^2$, and the integral of this over
$[0,5]$ diverges. So the $\sqrt{x}$ term saves the day. But wait! At this point I want
you to look back at page 460 and see how we saw that the integral

$$\int_5^\infty \frac{1}{x^2+\sqrt{x}}\,dx$$

also converges. What’s important in this last integral is the $x^2$ term, not the
$\sqrt{x}$ term. Without the $x^2$ term, this last integral would diverge. So the full
integral we looked at right at the beginning of Section 21.1.1 above,

$$\int_0^\infty \frac{1}{x^2+\sqrt{x}}\,dx,$$

converges because both the following pieces converge:

$$\int_0^5 \frac{1}{x^2+\sqrt{x}}\,dx \quad\text{and}\quad \int_5^\infty \frac{1}{x^2+\sqrt{x}}\,dx.$$

The problem spot at 0 is OK because of the $\sqrt{x}$ term and the problem spot
at $\infty$ is OK because of the $x^2$ term. Nice, huh?

How about this one:

$$\int_0^1 \frac{x+3}{x+x^5}\,dx?$$

Well, the problem spot is again at $x = 0$. Now $x + 3 \sim 3$ and $x + x^5 \sim x$ as
$x \to 0$, so

$$\frac{x+3}{x+x^5} \sim \frac{3}{x} \quad\text{as } x \to 0.$$

The improper integral $\int_0^1 3/x\,dx$ diverges by the p-test; the limit comparison
test now shows that our original integral

$$\int_0^1 \frac{x+3}{x+x^5}\,dx$$

diverges as well.


21.4.2 Trig functions near 0

Here are some very useful facts:

$\sin(x) \sim x$, $\tan(x) \sim x$, and $\cos(x) \sim 1$ as $x \to 0$.

These are just restatements of limits we’ve already looked at in Chapter 7:

$$\lim_{x\to 0}\frac{\sin(x)}{x} = 1, \quad \lim_{x\to 0}\frac{\tan(x)}{x} = 1, \quad\text{and}\quad \lim_{x\to 0}\cos(x) = 1.$$

(If the cosine limit bothers you, write $\cos(x)$ as $\cos(x)/1$ to see that $\cos(x) \sim 1$
as $x \to 0$ after all.) Beware: these asymptotic relations only work with
products and quotients, not sums and differences. For instance, you cannot
write $\sin(x) - x \sim 0$ as $x \to 0$; see the end of Section 20.4.1 in the previous
chapter for a more thorough discussion of this.
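Here is a rough numerical sketch of both points: the two ratios head to 1, while $\sin(x) - x$ is small but definitely not "asymptotic to 0". The comparison with $-x^3/6$ is an extra fact borrowed from the Maclaurin series of sine, not something used in this section, and the function name `trig_ratios` is invented for the illustration.

```python
import math

def trig_ratios(x):
    """Return (sin(x)/x, tan(x)/x, (sin(x) - x)/x^3) for a small nonzero x."""
    return (math.sin(x) / x,
            math.tan(x) / x,
            (math.sin(x) - x) / x**3)

# First two entries tend to 1; the third tends to -1/6, so sin(x) - x
# behaves like -x^3/6 rather than like 0.
for x in (0.1, 0.01, 0.001):
    print(x, trig_ratios(x))
```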

Let’s look at some examples. Consider

$$\int_0^1 \frac{1}{\tan(x)}\,dx \quad\text{and}\quad \int_0^1 \frac{1}{\sqrt{\tan(x)}}\,dx.$$

These look pretty similar, but appearances can be deceptive. We’re going to
use $\tan(x) \sim x$ (as $x \to 0$) for both integrals. I’ll let you fill in the details, but
here’s the basic idea: for the first integral, use $1/\tan(x) \sim 1/x$ (as $x \to 0$)
and the limit comparison test to see that the integral diverges. On the other
hand, to do the second integral, use $1/\sqrt{\tan(x)} \sim 1/\sqrt{x}$ (as $x \to 0^+$) and the
limit comparison test to see that this integral converges.


Here’s another example: how about

$$\int_0^1 \frac{\sin(x)}{x^{3/2}}\,dx?$$

Without the $\sin(x)$ factor, we don’t have a hope of convergence, since 3/2 is
greater than 1 and the integral would diverge by the p-test. But the $\sin(x)$
factor saves the day:

$$\frac{\sin(x)}{x^{3/2}} \sim \frac{x}{x^{3/2}} = \frac{1}{x^{1/2}} \quad\text{as } x \to 0^+.$$

Since $\int_0^1 1/x^{1/2}\,dx$ converges, the limit comparison test shows that our original
integral converges. What’s interesting about this example is that the integral

$$\int_1^\infty \frac{\sin(x)}{x^{3/2}}\,dx$$

also converges, but for completely different reasons. Here the problem spot is
at $\infty$, and we have to use an absolute integral instead. A direct comparison
of the absolute integral gives

$$\int_1^\infty \frac{|\sin(x)|}{x^{3/2}}\,dx \le \int_1^\infty \frac{1}{x^{3/2}}\,dx < \infty,$$

so our integral converges (we have used the p-test, the comparison test, and
the absolute convergence test). Note that the power 3/2 is good at $\infty$ (1/2
would be bad!) and that this time the sine function didn’t help (or hurt, for
that matter). Incidentally, we have now shown that

$$\int_0^\infty \frac{\sin(x)}{x^{3/2}}\,dx$$

converges. Can you see why?

A word of warning: just because we’re looking at the behavior as $x \to 0$
doesn’t mean that the problem spot has to be at 0. It might even be at $\infty$,
as the following example shows:

$$\int_1^\infty \sin\left(\frac{1}{x}\right)dx.$$

Here the problem spot is at $\infty$, but $1/x$ becomes very small as $x \to \infty$. So in
the relation $\sin(x) \sim x$ as $x \to 0$, replace $x$ by $1/x$ to see that $\sin(1/x) \sim 1/x$
as $1/x \to 0$. Of course, as $x \to \infty$, we know that $1/x \to 0$, so we have shown
that

$$\sin\left(\frac{1}{x}\right) \sim \frac{1}{x} \quad\text{as } x \to \infty.$$

Now you can use the limit comparison test to say that the above integral
diverges, since $\int_1^\infty 1/x\,dx$ diverges.

21.4.3 Exponentials near 0

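In some sense, exponentials have no effect at 0. More precisely,

$$e^x \sim 1 \quad\text{and}\quad e^{-x} \sim 1 \quad\text{as } x \to 0.$$

This is just another way of saying that

$$\lim_{x\to 0} e^x = 1 \quad\text{and}\quad \lim_{x\to 0} e^{-x} = 1.$$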

For example, the improper integral

$$\int_0^1 \frac{e^x}{x\cos(x)}\,dx$$

diverges, because

$$\frac{e^x}{x\cos(x)} \sim \frac{1}{x \cdot 1} = \frac{1}{x} \quad\text{as } x \to 0^+.$$


(You get to fill in the rest of the details.) Beware: this only applies to the
exponential of a small quantity (like $x$ or $-x$). An example of a tricky integral
where you could trip up is

$$\int_0^1 \frac{e^{-1/x}}{x^5}\,dx.$$

It would be wrong to write $e^{-1/x} \sim 1$, since $1/x \to \infty$ as $x \to 0^+$. We should
really use the techniques from Section 21.3.3 above. In particular, there we
saw that

$$e^{-\text{large stuff}} \le \frac{C}{(\text{same large stuff})^n}$$

for any $n$. If the large stuff is $1/x$ (remember, $x$ is small and positive so $1/x$
is large), then this becomes

$$e^{-1/x} \le \frac{C}{(1/x)^n} = Cx^n$$

for any $n$. Now I leave it to you to see that any choice of $n$ which is greater
than 4 will work. For example, taking $n = 5$, you get

$$\int_0^1 \frac{e^{-1/x}}{x^5}\,dx \le \int_0^1 \frac{Cx^5}{x^5}\,dx = C\int_0^1 1\,dx < \infty,$$

where the last integral obviously converges because there are no problem spots
(in fact, the integral is just 1). That was a pretty tough question, by the way.
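In case the constant $C$ feels mysterious, here is a sketch that hunts for it numerically in the case $n = 5$. The value $5^5 e^{-5}$ comes from a routine maximization of $e^{-1/x}/x^5$ (the maximum is at $x = 1/5$); that calculation isn't in the text, so treat it as a side remark, and the name `g` is made up here.

```python
import math

def g(x, n=5):
    """e^(-1/x) / x^n, the quantity that the constant C has to bound on (0, 1]."""
    return math.exp(-1 / x) / x**n

xs = [k / 10000 for k in range(1, 10001)]   # grid on (0, 1]
C_grid = max(g(x) for x in xs)              # largest value seen on the grid
C_exact = 5**5 * math.exp(-5)               # value of g at its maximum, x = 1/5

print(C_grid, C_exact)
```

Both numbers come out a bit above 21, so $e^{-1/x} \le Cx^5$ holds on $(0, 1]$ with that $C$.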
Here’s another possible trap. In the integral

$$\int_0^2 \frac{dx}{\sqrt{e^x - 1}},$$

you might be tempted to use the relation $e^x \sim 1$ as $x \to 0$ to try and write
$e^x - 1 \sim 0$ as $x \to 0$. This last relation can’t be true, since you’re not allowed
to divide by 0. We need to be cleverer. In Section 20.4.1 of the previous
chapter, we used the classic limit

$$\lim_{x\to 0}\frac{e^x - 1}{x} = 1$$

from Section 9.4.2 of Chapter 9 to conclude that

$$e^x - 1 \sim x \quad\text{as } x \to 0.$$

It follows that

$$\frac{1}{\sqrt{e^x - 1}} \sim \frac{1}{\sqrt{x}} \quad\text{as } x \to 0^+.$$

Now the limit comparison test shows that the original integral converges.
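A quick numerical sanity check of $e^x - 1 \sim x$ is sketched below. It uses `math.expm1`, which computes $e^x - 1$ accurately for small $x$; the function name `expm1_ratio` is just for this illustration.

```python
import math

def expm1_ratio(x):
    """(e^x - 1)/x, computed with expm1 to avoid cancellation near 0."""
    return math.expm1(x) / x

# The ratio tends to 1, so 1/sqrt(e^x - 1) is comparable to 1/sqrt(x) near 0+.
for x in (0.1, 0.001, 1e-6):
    print(x, expm1_ratio(x), math.sqrt(x) / math.sqrt(math.expm1(x)))
```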


21.4.4 Logarithms near 0

Here the principle is that logs go to $-\infty$ slowly as $x \to 0^+$. Let’s make
things go to $\infty$ instead by taking absolute values, remembering that $\ln(x)$ is
negative when $0 < x < 1$. So the idea is that no matter how small $\alpha > 0$ is,
there’s some constant $C$ such that

$$|\ln(x)| \le \frac{C}{x^\alpha} \quad\text{for all } 0 < x < 1.$$

This follows from the limit

$$\lim_{x\to 0^+} x^a \ln(x) = 0,$$

which we looked at in Section 9.4.6 of Chapter 9 (except we used $a$ instead
of $\alpha$). The argument is very similar to the one we used at the beginning of
Section 21.3.3 above.

So, to understand

$$\int_0^1 \frac{|\ln(x)|}{x^{0.9}}\,dx,$$

we use a new variety of the same trick that we’ve used several times before.
Without the $|\ln(x)|$ term, the integral would converge. We need a power so
small that when you add it to 0.9 you are still below the critical power
1. Let’s try $\alpha = 0.05$. The above boxed inequality now says that we have
$|\ln(x)| \le C/x^{0.05}$, so

$$\frac{|\ln(x)|}{x^{0.9}} \le \frac{C/x^{0.05}}{x^{0.9}} = \frac{C}{x^{0.9}x^{0.05}} = \frac{C}{x^{0.95}}.$$

Now you can use the comparison test and p-test to finish off the problem and
show that the above integral converges. I want you to convince yourself that
if we picked $\alpha$ to be anything greater than or equal to 0.1, we’d be in the
whoop-di-doo case. By the way, we have now automatically seen that

$$\int_0^1 \frac{\ln(x)}{x^{0.9}}\,dx$$

converges, since it’s just the negative of the original integral.


For another example, consider

$$\int_0^{1/2} \frac{1}{x^2|\ln(x)|}\,dx.$$

If the $|\ln(x)|$ factor weren’t there, this would diverge by the p-test. The $|\ln(x)|$
tries to help the integral to converge, but it can’t help very much, since it’s
only a log, and logarithms grow slowly. So we still expect the integral to
diverge. To get the math right, note that since $|\ln(x)| \le C/x^\alpha$, we can take
reciprocals to see that $1/|\ln(x)| \ge x^\alpha/C$. Once again we have to choose $\alpha$ to
be small enough so that we avoid the whoop-di-doo case. We have

$$\frac{1}{x^2|\ln(x)|} \ge \frac{x^\alpha}{Cx^2} = \frac{1}{Cx^{2-\alpha}},$$

so we will be OK as long as $\alpha \le 1$. (Why?) In fact, with $\alpha = 1$, the right-hand
side becomes $1/Cx$, and you can proceed from here to see that the integral
diverges. Note that the integral

$$\int_0^{1/2} \frac{1}{x^2\ln(x)}\,dx$$

also diverges (to $-\infty$) since it is the negative of the original integral.
One final example: how about

$$\int_0^{1/2} \frac{1}{x^{0.9}|\ln(x)|}\,dx?$$

Now the integral converges without the $|\ln(x)|$ factor, but throwing this large
quantity into the denominator just helps the integral converge faster. So
we just need to find the minimum of $|\ln(x)|$ on $(0, 1/2]$; think about it and
convince yourself that the minimum occurs when $x = 1/2$, and so whenever
$0 < x \le 1/2$, we have $|\ln(x)| \ge |\ln(1/2)| = \ln(2)$. Finally, take reciprocals and
divide by $x^{0.9}$ to get

$$\frac{1}{x^{0.9}|\ln(x)|} \le \frac{1}{x^{0.9}\ln(2)}$$

for all $0 < x \le 1/2$. Now you just need to apply the comparison test and the
p-test to see that the original integral converges.
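The constant $C$ in the inequality $|\ln(x)| \le C/x^\alpha$ can be pinned down explicitly, and the sketch below checks it numerically for $\alpha = 0.05$. The formula $C = 1/(\alpha e)$ comes from maximizing $x^\alpha|\ln(x)|$ on $(0,1)$ (the maximum is at $x = e^{-1/\alpha}$); that computation isn't part of this section, so take it as an assumption of the sketch. The name `best_constant` is invented.

```python
import math

def best_constant(alpha):
    """Smallest C with |ln x| <= C / x**alpha on (0, 1): the max of x**alpha * |ln x|."""
    return 1 / (alpha * math.e)

alpha = 0.05
C = best_constant(alpha)

# Logarithmically spaced grid from about 0.79 down to about 1e-300.
xs = [10.0 ** (-k / 10) for k in range(1, 3000)]
worst = max(x**alpha * abs(math.log(x)) for x in xs)

print(C, worst)   # worst should sit just below C
```

Notice how large $C$ has to be (over 7) even for this modest $\alpha$; the smaller you make $\alpha$, the bigger the price you pay in $C$.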


21.4.5 The behavior of more general functions near 0 

In Section 24.2.2 of Chapter 24, we’ll learn about Maclaurin series. If you
haven’t seen this yet, don’t worry about it! Make a note to come back and
read this section after you’ve learned all about Maclaurin series. Anyway, the
basic idea is that if a function has a Maclaurin series which converges to the
function near 0, then the function is asymptotic to the lowest-order term in
the series as $x \to 0$. That is,

if $f(x) = a_n x^n + a_{n+1}x^{n+1} + \cdots$, then $f(x) \sim a_n x^n$ as $x \to 0$.

Consider the following examples:


We know that $\cos(x) \sim 1$ as $x \to 0$, but that doesn’t tell us a thing about
$1 - \cos(x)$. One way to deal with this quantity is to use the Maclaurin series
for $\cos(x)$:

$$\cos(x) = 1 - \frac{x^2}{2!} + \frac{x^4}{4!} - \cdots;$$

this can be rearranged to

$$1 - \cos(x) = \frac{x^2}{2!} - \frac{x^4}{4!} + \cdots.$$

So, by the above principle, the lowest-degree term on the right-hand side
dominates and we can write

$$1 - \cos(x) \sim \frac{x^2}{2} \quad\text{as } x \to 0.$$

By the way, this agrees with an example in Section 7.1.2 of Chapter 7 where
we showed that

$$\lim_{x\to 0}\frac{1 - \cos(x)}{x^2} = \frac{1}{2}.$$

In any case, I leave it to you as an exercise to use the above asymptotic relation
to show that the first of our above integrals diverges whereas the second one
converges.
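Here is a small numerical sketch of the relation between $1 - \cos(x)$ and $x^2/2$. To dodge round-off when subtracting two nearly equal numbers, it rewrites $1 - \cos(x)$ as $2\sin^2(x/2)$, a standard half-angle identity; the function name is made up.

```python
import math

def one_minus_cos(x):
    """1 - cos(x), written as 2 sin^2(x/2) to avoid cancellation for small x."""
    return 2 * math.sin(x / 2) ** 2

# The ratio to x^2/2 should tend to 1 as x tends to 0.
for x in (0.5, 0.05, 0.005):
    print(x, one_minus_cos(x) / (x**2 / 2))
```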


21.5 How to Deal with Problem Spots Not at 0 or $\infty$

If a problem spot occurs at some finite value other than 0, do a substitution.

Specifically:

• If the only problem spot in $\int_a^b f(x)\,dx$ occurs at $x = a$, make the
substitution $t = x - a$. Note that $dt = dx$. The new integral has a problem
spot at 0 only.

• If the only problem spot in $\int_a^b f(x)\,dx$ occurs at $x = b$, make the
substitution $t = b - x$. Note that $dt = -dx$. Use the minus sign to switch the
limits of integration. The new integral should have a problem spot at 0
only.

For example, on page 436, we looked at

$$\int_0^3 \frac{1}{x(x-1)(x+1)(x-2)}\,dx.$$

We split this into five integrals, each with only one problem spot, and claimed
that they all diverge. One such piece (we called it $I_5$) is

$$I_5 = \int_2^3 \frac{1}{x(x-1)(x+1)(x-2)}\,dx.$$

Here the problem spot is at $x = 2$, so let’s substitute $t = x - 2$. Since $x = t + 2$,
the integral becomes

$$\int_0^1 \frac{1}{(t+2)(t+1)(t+3)t}\,dt.$$

The bounds of integration are now 0 and 1, and the problem spot has been
shifted over to 0. Now we can use the fact that the lowest-degree term in any
polynomial dominates near 0 to write

$t + 2 \sim 2$, $t + 1 \sim 1$, and $t + 3 \sim 3$ as $t \to 0$.

We can combine these facts to see that

$$\frac{1}{(t+2)(t+1)(t+3)t} \sim \frac{1}{2 \times 1 \times 3 \times t} = \frac{1}{6t} \quad\text{as } t \to 0^+.$$

The limit comparison test and p-test now show that the above integral diverges.

Another piece of the original integral (we called it $I_4$) is

$$I_4 = \int_{3/2}^2 \frac{1}{x(x-1)(x+1)(x-2)}\,dx.$$

Now the problem spot is at $x = 2$, which is the right-hand limit of integration.
So substitute $t = 2 - x$. When $x = 3/2$, we see that $t = 1/2$, and when $x = 2$,
$t = 0$. Since $dt = -dx$ and $x = 2 - t$, we have

$$\int_{3/2}^2 \frac{1}{x(x-1)(x+1)(x-2)}\,dx = -\int_{1/2}^0 \frac{1}{(2-t)(1-t)(3-t)(-t)}\,dt = \int_0^{1/2} \frac{1}{(2-t)(1-t)(3-t)(-t)}\,dt.$$

In this last integral, we have used the minus sign from the equation $dx = -dt$
in order to switch the limits of integration (as described in Section 16.3 of
Chapter 16). Anyway, it’s not too hard to see that

$$\frac{1}{(2-t)(1-t)(3-t)(-t)} \sim -\frac{1}{6t} \quad\text{as } t \to 0^+,$$

so the above integral diverges (again by the limit comparison test and p-test;
you get to fill in the details, taking care to handle the negative integrand
correctly). In fact, you should now try to show that the other three integrals
($I_1$, $I_2$, and $I_3$ on page 436) diverge.
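If you want to double-check the two asymptotic relations near the problem spot $x = 2$ without redoing the algebra, a numerical sketch like the following works. It evaluates the original integrand at $x = 2 + t$ and $x = 2 - t$ (the two substitutions above) and compares with $1/(6t)$ and $-1/(6t)$; the function name `f` is just a label for the integrand.

```python
def f(x):
    """The integrand 1 / (x (x - 1)(x + 1)(x - 2))."""
    return 1 / (x * (x - 1) * (x + 1) * (x - 2))

# Both ratios should approach 1 as t -> 0+.
for t in (0.1, 0.01, 0.001):
    right_of_2 = f(2 + t) / (1 / (6 * t))     # substitution t = x - 2
    left_of_2 = f(2 - t) / (-1 / (6 * t))     # substitution t = 2 - x
    print(t, right_of_2, left_of_2)
```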










CHAPTER 22 


Sequences and Series: Basic Concepts 

Here’s the good news: infinite series are pretty similar to improper integrals. 
So a lot, but not all, of the relevant techniques are shared and we don’t 
need to reinvent the wheel. In order to define what an infinite series is, we’ll 
also need to look at sequences. Just as in the case of improper integrals, 
I’m devoting two chapters to sequences and series: this first chapter covers 
general principles, while the next one is more practical and contains methods 
for solving problems. If you’re reading this for the first time, go ahead and 
check out the details of this chapter. For review, a quick glance over the main 
points should suffice before moving on to the examples in the next chapter. 
Here are the topics for this chapter: 

• convergence and divergence of sequences; 

• two important sequences; 

• the connection between limits of sequences and limits of functions; 

• convergence and divergence of series, and how to handle geometric series;

• the nth term test for series; 

• the connection between series and improper integrals; and 

• an introduction to the ratio test, root test, integral test, and alternating 
series test. 

Again, this chapter is mostly theoretical! If it’s examples you want, most 
are in the next chapter. 

22.1 Convergence and Divergence of Sequences

A sequence is a collection of numbers in order. It might have a finite number of 
terms, or it might go on forever, in which case it is called an infinite sequence. 
For example, 

$$0,\ 1,\ -1,\ 2,\ -2,\ 3,\ -3,\ \ldots$$

is an infinite sequence which incidentally includes every integer, positive and
negative. Sequences are normally written using subscript notation, where $a_1$
denotes the first element of the sequence, $a_2$ the second, $a_3$ the third, and so on.
(Sometimes $a_0$ is the first element, $a_1$ the second, and so on. Also, we don’t
have to use $a$; for example, $b_n$ or any other letter is fair game.) So in the
above example, $a_1 = 0$, $a_2 = 1$, $a_3 = -1$, $a_4 = 2$, and so on. Often a sequence

is given by a formula, such as 

a n = ^ 

for n = 1, 2, ... This defines the sequence 

sin(l) sin(2) sin(3) sin(4) 

^*** 

Given an infinite sequence, our main focus is going to be on the limiting 
behavior of the values of the sequence as the index n tends to infinity. That 
is, what happens to the sequence as you look farther and farther along it? In 
math notation, does 

$$\lim_{n\to\infty} a_n$$

exist, and if so, what is it? By the way, we haven’t really defined the above
limit, but the definition is not much different from the definition of $\lim_{x\to\infty} f(x)$
for a function $f$. (See Section A.3.3 of Appendix A for the actual definition.)
The basic idea is that the statement

$$\lim_{n\to\infty} a_n = L$$

means that $a_n$ might wander around for a little while, but eventually gets
very close (as close as you like) to $L$ and stays at least as close to $L$ for ever
after. If there’s such a number $L$, then the sequence $\{a_n\}$ converges; otherwise
it diverges. Just like functions, sequences can diverge to $\infty$ or $-\infty$, or they
can oscillate around (possibly crazily) and not get close to any particular
value. For example, the above sequence $0, 1, -1, 2, -2, \ldots$ diverges; it does
not diverge to $\infty$ or $-\infty$, but instead oscillates between positive and negative
numbers of bigger and bigger absolute value.

By the way, as we did with functions, we sometimes say that $a_n \to L$ as
$n \to \infty$. This means the same thing as saying $\lim_{n\to\infty} a_n = L$.


22.1.1 The connection between sequences and functions

Consider the sequence given by 


= 


sin(n) 


which we looked at earlier. This is closely related to the function / defined 
by 


In fact, $a_n$ is equal to $f(n)$ for each positive integer $n$. So if we can establish
that $\lim_{x\to\infty} f(x)$ exists, then we’ll know that the sequence $\{a_n\}$ has the same
limit. The sequence inherits the limiting properties of the function. There’s
also a connection to horizontal asymptotes: remember that if $\lim_{x\to\infty} f(x) = L$,
then the graph of $y = f(x)$ has a horizontal asymptote at $y = L$.






sin(n) 


as n — oo; 


since the cosine function is continuous at 0, we can hit both sides with cosine 
to get 


( sin(n) ) 


cos(O) = 1 as n ^ oo. 


One more useful tool that we can borrow from the theory of functions is
l’Hôpital’s Rule. (See Section 14.1 in Chapter 14.) The problem with using
the rule on a sequence is that you can’t differentiate the quantity $a_n$ with
respect to the variable $n$, since $n$ has to be an integer. Indeed, when you
differentiate a function $f$ with respect to a variable $x$, the idea is that you
wobble $x$ around a little and see what happens to $f(x)$. You can’t wobble an
integer around because it wouldn’t be an integer any more. So, if you want
to use l’Hôpital’s Rule, you have to embed the sequence in a suitable function
first. For example, if $a_n = \ln(n)/\sqrt{n}$, you can find $\lim_{n\to\infty} a_n$ by letting

$$f(x) = \frac{\ln(x)}{\sqrt{x}}$$

and then finding $\lim_{x\to\infty} f(x)$ by using l’Hôpital’s Rule. Note that this is an
$\infty/\infty$ case, so you can use the rule here. Differentiate the top and bottom
separately to get

$$\lim_{x\to\infty}\frac{\ln(x)}{\sqrt{x}} = \lim_{x\to\infty}\frac{1/x}{1/(2\sqrt{x})} = \lim_{x\to\infty}\frac{2}{\sqrt{x}} = 0.$$

Since the function limit is 0, the sequence $a_n$ also converges to 0 as $n \to \infty$.
(We could also have used the fact that logs grow slowly at $\infty$ to find the above
limit; just apply the formula at the beginning of Section 21.3.4 in the previous
chapter with $\alpha = 1/2$.)
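Numerically, the sequence and the function it is embedded in tell the same story. The following sketch just evaluates $a_n = \ln(n)/\sqrt{n}$ at a few large integers, next to the simplified expression $2/\sqrt{x}$ from the l'Hôpital step; the function name `a` mirrors the sequence but is otherwise arbitrary.

```python
import math

def a(n):
    """The sequence a_n = ln(n) / sqrt(n)."""
    return math.log(n) / math.sqrt(n)

# Both columns shrink toward 0, though not at the same rate.
for n in (10, 10**3, 10**6, 10**12):
    print(n, a(n), 2 / math.sqrt(n))
```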


22.1.2 Two important sequences

Pick some constant number r and consider the sequence given by a n = r n 
starting at n = 0. This is a geometric progression. Notice that each term 
is a constant multiple of the previous one. Let’s look at a few examples of 
geometric progressions: 

• if $r = 0$, the sequence is just $0, 0, 0, \ldots$, which clearly converges to 0;

• if $r = 1$, the sequence is just $1, 1, 1, \ldots$, which clearly converges to 1;

• if $r = 2$, the sequence is $1, 2, 4, 8, \ldots$, which evidently diverges to $\infty$;

• if $r = -1$, the sequence is $1, -1, 1, -1, 1, \ldots$, which diverges, but not to
$\infty$ or $-\infty$, because it keeps on oscillating back and forth between $-1$
and 1; in other words, the limit does not exist (DNE);

• if $r = -2$, the sequence is $1, -2, 4, -8, \ldots$, which diverges in the same
way (the limit does not exist); in fact, this time the oscillations are even
wilder;

• if $r = 1/2$, the sequence is $1, 1/2, 1/4, 1/8, \ldots$, which converges to 0; and
finally,

• if $r = -1/2$, the sequence is $1, -1/2, 1/4, -1/8, \ldots$, which also converges
to 0, despite the oscillations, since these oscillations eventually become
as small as you like.

These are all special cases of the general rule, which is as follows: 


$$\lim_{n\to\infty} r^n \begin{cases} = 0 & \text{if } -1 < r < 1, \\ = 1 & \text{if } r = 1, \\ = \infty & \text{if } r > 1, \\ \text{DNE} & \text{if } r \le -1. \end{cases}$$

Here’s how to justify the above limit. First, when $r \ge 0$, the limit follows
from the similar limit involving $r^x$ that we looked at in Section 9.4.4 of
Chapter 9 (see the middle box). The tricky case occurs when $r < 0$, since the
resulting sequence oscillates. To deal with the oscillations, notice that

$$-|r|^n \le r^n \le |r|^n$$

for all $n$. The nice thing about this is that the sequences $\{-|r|^n\}$ and $\{|r|^n\}$
aren’t oscillating. In fact, if $-1 < r < 0$, then $|r| < 1$, so we already know
that both of the sequences converge to 0; now we can just use the sandwich
principle to see that $r^n \to 0$ as well. Finally, if $r \le -1$, then $r^n$ cannot
possibly converge, since it keeps flipping between positive numbers greater
than or equal to 1 and negative numbers less than or equal to $-1$. The resulting
limit does not exist (DNE) due to these oscillations. (The situation here is
similar to the limit which we looked at in Section 3.4 of Chapter 3;
also check out Section A.3.4 of Appendix A.)

Geometric progressions don’t have to start at 1. If we set $a_n = ar^n$, where
$a$ is some constant, then the first term $a_0$ is equal to $a$. You can find $\lim_{n\to\infty} ar^n$
by multiplying the values of $\lim_{n\to\infty} r^n$ in the box above by $a$. Most important,
if $-1 < r < 1$, then $\lim_{n\to\infty} ar^n$ is 0 regardless of the value of $a$.
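The general rule for $\lim r^n$ is easy to encode. The function below is only a sketch of the four cases (with `None` standing in for "does not exist", which is a convention chosen here, not anything standard), followed by a peek at a few actual terms.

```python
def geometric_limit(r):
    """Limit of r**n as n -> infinity, following the four cases of the rule.

    Returns 0.0, 1.0, or float('inf'); returns None when the limit
    does not exist (r <= -1).
    """
    if -1 < r < 1:
        return 0.0
    if r == 1:
        return 1.0
    if r > 1:
        return float("inf")
    return None  # r <= -1: the terms keep flipping sign without shrinking

for r in (0, 1, 2, -1, -2, 0.5, -0.5):
    terms = [r**n for n in range(5)]
    print(r, terms, geometric_limit(r))
```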

Having spent a lot of time on geometric progressions, let’s look at the limit 
of another sequence very quickly. In particular, if k is any constant, then 

$$\lim_{n\to\infty}\left(1 + \frac{k}{n}\right)^n = e^k.$$

This follows directly from the limit at the beginning of Section 9.2.3 in
Chapter 9. It’s really useful to know this limit in the context of sequences, however.
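As a quick illustration of this limit, the sketch below evaluates $(1 + k/n)^n$ for increasing $n$ and compares it with $e^k$. The function name `compound` is invented (a nod to compound interest) and $k = 2$ is an arbitrary choice.

```python
import math

def compound(k, n):
    """The n-th term of the sequence (1 + k/n)^n."""
    return (1 + k / n) ** n

k = 2
for n in (10, 1000, 10**5, 10**7):
    print(n, compound(k, n), math.exp(k))
```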

22.2 Convergence and Divergence of Series

A series is just a sum. We’d like to add up all of the terms of a sequence a n . 
So, instead of putting commas between the elements, you put plus signs. If 
the sequence is infinite, things get a little hairy — after all, what does it even 
mean to add up infinitely many numbers? For example, if the sequence $a_n$ is
the geometric progression $1, 1/2, 1/4, 1/8, \ldots$, then the corresponding series is
$1 + 1/2 + 1/4 + 1/8 + \cdots$. We need to do something clever to handle the dots
at the end, which indicate that the series goes on forever.

In general, we’d like to understand what

$$a_1 + a_2 + a_3 + \cdots$$

means. To deal with this infinite sum, let’s chop it off after some large number
of terms. We’ll call the number of terms $N$, so the chopped-off series looks
like this:

$$a_1 + a_2 + a_3 + \cdots + a_{N-1} + a_N.$$

This is just a sum of finitely many quantities, so it makes sense. Now, here’s
what we’d like to say:

$$a_1 + a_2 + a_3 + \cdots = \lim_{N\to\infty}\left(a_1 + a_2 + a_3 + \cdots + a_{N-1} + a_N\right).$$

The right-hand side looks a little weird, since the number of terms is changing
as $N$ gets larger. So let’s define a new sequence, which we’ll call $\{A_N\}$, by
setting

$$A_N = a_1 + a_2 + a_3 + \cdots + a_{N-1} + a_N.$$

This new sequence is called the sequence of partial sums. The weird equation
now looks like this:

$$a_1 + a_2 + a_3 + \cdots = \lim_{N\to\infty} A_N.$$

N^oo 

Now the right-hand side isn’t so weird —— it’s just the limit of a sequence. If 
the limit exists and equals L, then we’ll say that the series on the left-hand 
side converges to L. If the limit doesn’t exist, then the series diverges. 

Here’s a nice analogy to understand all this stuff. I want you to imagine 
that you’re standing at a rest stop on a long, straight highway which extends 
in both directions — the way you’re going and the way you’ve just come from. 
The rest stop is at position 0. (We’ve seen this old highway before, for example 
in Section 5.2.2 of Chapter 5.) Unfortunately you have lost all your free will, 
and some guy with a megaphone is commanding you every minute to walk a 
certain number of feet. You can only move when he says so. If he calls out 
a negative number, you actually walk backward. Each time you move, we’ll 
call it a step. (Hopefully the guy won’t ask you to move 100 feet in a single 
step!) 

The first number that megaphone man calls out is $a_1$, so you move from
position 0 to position $a_1$ (the units are in feet, but I won’t say that every
time). The next number is $a_2$, so you walk forward $a_2$ feet. Where does that
put you? At position $a_1 + a_2$, since you started at $a_1$. After the third number
he calls out, which is of course $a_3$, you’ll be at position $a_1 + a_2 + a_3$. The
pattern should be pretty clear: after $N$ steps of sizes $a_1$, $a_2$, $a_3$, and so on up
to $a_N$, you will be at position

$$a_1 + a_2 + a_3 + \cdots + a_{N-1} + a_N.$$

This is exactly the value of the partial sum $A_N$ which we defined above! In
other words, $A_N$ is your position after you take $N$ steps. So, when we write

$$a_1 + a_2 + a_3 + \cdots = \lim_{N\to\infty} A_N,$$

we’re saying that you can add up all the steps, provided that you eventually 
start homing in on a particular point on the highway. You have to get really 
really close to that point, never straying far away from it. You’ll be making 
tiny little steps, tiptoeing around this point. Otherwise, there’s no hope of 
adding up all the steps and the series will diverge. 

Now it’s time to bust out some sigma notation. (We looked at this in 
Section 15.1 in Chapter 15.) The formula for $A_N$ becomes

$$A_N = a_1 + a_2 + a_3 + \cdots + a_{N-1} + a_N = \sum_{n=1}^{N} a_n.$$

The infinite series is written as

$$a_1 + a_2 + a_3 + \cdots = \sum_{n=1}^{\infty} a_n.$$

So, here’s how to define the value of an infinite series using sigma notation:

$$\sum_{n=1}^{\infty} a_n = \lim_{N\to\infty}\sum_{n=1}^{N} a_n.$$

If the limit on the right-hand side doesn’t exist, then the series on the left-hand 
side diverges. Remember, the right-hand side is really the limit of a sequence, 
so the above equation isn’t as obvious as the notation makes it appear to be. 

Let’s just review the scenario once more before we move on. You begin 
with an infinite sequence 

$$\{a_n\} = a_1, a_2, a_3, \ldots$$

and use it to construct an infinite series:

$$\sum_{n=1}^{\infty} a_n = a_1 + a_2 + a_3 + \cdots.$$

To understand the limiting behavior of this series, make a new sequence of
partial sums:

$$A_N = \sum_{n=1}^{N} a_n = a_1 + a_2 + \cdots + a_{N-1} + a_N.$$


By definition, the limit of the series is the same as the limit of the new 
sequence of partial sums, if the limit exists; otherwise the series diverges. 
Since there are two sequences and one series floating around here, make sure 
you understand what’s what! 
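To keep the two sequences straight, it can help to see them side by side in code. This sketch uses Python's `itertools.accumulate` to turn a sequence of terms into its sequence of partial sums; the generator `halves` (the progression $1, 1/2, 1/4, \ldots$) and the wrapper `partial_sums` are names invented for the example.

```python
from itertools import accumulate, islice

def halves():
    """The terms a_n = 1/2^n for n = 0, 1, 2, ... (an infinite generator)."""
    n = 0
    while True:
        yield 1 / 2**n
        n += 1

def partial_sums(terms):
    """The sequence of partial sums A_N built from a sequence of terms."""
    return accumulate(terms)

print(list(islice(halves(), 6)))                 # the underlying sequence
print(list(islice(partial_sums(halves()), 6)))   # the partial sums
# [1.0, 0.5, 0.25, 0.125, 0.0625, 0.03125]
# [1.0, 1.5, 1.75, 1.875, 1.9375, 1.96875]
```

The first list heads to 0 and the second heads to 2; the value of the series is, by definition, the limit of the second list.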

By the way, we don’t need to begin our series at n = 1. You can begin at 
any number, even n = 0. All you have to do is change the starting term in 
the partial sums and everything works out. Now, here’s an important point: 
whether a series converges or diverges has nothing to do with the starting 
point of the series! For example, we’ll see in Section 22.4.3 below that the
series

$$\sum_{n=1}^{\infty}\frac{1}{n}$$

diverges. This immediately tells us that all the following series diverge as well:

$$\sum_{n=5}^{\infty}\frac{1}{n} \quad\text{and even}\quad \sum_{n=1000000}^{\infty}\frac{1}{n}.$$

To see why the first of these series diverges, just break out the first four terms
of the original series, like this:

$$\sum_{n=1}^{\infty}\frac{1}{n} = 1 + \frac{1}{2} + \frac{1}{3} + \frac{1}{4} + \sum_{n=5}^{\infty}\frac{1}{n} = \frac{25}{12} + \sum_{n=5}^{\infty}\frac{1}{n}.$$

So the series starting at $n = 1$ and the series starting at $n = 5$ differ only
by the finite constant 25/12. Since the series starting at $n = 1$ diverges to
$\infty$, subtracting 25/12 isn’t going to affect this at all. The series starting at
$n = 5$ must also diverge. Of course, there’s nothing special about 5: the same
argument works for any starting point. Similarly, we’ll see in Section 22.4.3
below that

$$\sum_{n=1}^{\infty}\frac{1}{n^2}$$

actually converges. This means that all of the following series automatically 
converge as well: 



E 




and even 


E 




See if you can prove this by splitting up the original sum. 
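The claim that changing the starting point only shifts things by a finite constant can be checked exactly with rational arithmetic. In this sketch (the function name `harmonic_partial` is made up), the partial sums of $\sum 1/n$ starting at $n = 1$ and at $n = 5$ always differ by the same amount, no matter how far out you go.

```python
from fractions import Fraction

def harmonic_partial(start, N):
    """Exact value of 1/start + 1/(start+1) + ... + 1/N as a Fraction."""
    return sum(Fraction(1, n) for n in range(start, N + 1))

for N in (10, 100, 1000):
    # The difference is just the four terms that were broken off.
    print(N, harmonic_partial(1, N) - harmonic_partial(5, N))   # 25/12 each time
```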

One more thing before we go on to geometric series: consider the series

$$\sum_{n=0}^{\infty}\frac{1}{n^2}.$$

We’ve just changed the starting point to $n = 0$, but now something annoying
happens: the first term is $1/0^2$, which doesn’t exist. So the above series is
whacked out. It’s not that it diverges; it just doesn’t make sense, since the 
first term isn’t defined. We’ll always try to avoid this situation by starting 
at a large enough value of n so that all the terms of the series are actually 
defined. 


22.2.1 Geometric series (theory)

Let’s look at an important example of an infinite series. Suppose we start with
the geometric progression $1, r, r^2, r^3, \ldots$, which we looked at in Section 22.1.2
above. We can use this sequence as the terms of an infinite series:

$$1 + r + r^2 + r^3 + \cdots = \sum_{n=0}^{\infty} r^n.$$

This is called a geometric series. The question is, does it converge, and if so,
to what?

To find out, we’d better look at the partial sums. Pick a number $N$; then
the partial sum $A_N$ is given by

$$A_N = 1 + r + r^2 + r^3 + \cdots + r^{N-1} + r^N.$$

In sigma notation, we have

$$A_N = \sum_{n=0}^{N} r^n.$$

Hopefully, in your previous math studies you’ve seen that the above expression
can be simplified as follows:

$$A_N = 1 + r + r^2 + r^3 + \cdots + r^{N-1} + r^N = \frac{1 - r^{N+1}}{1 - r}$$

as long as $r \ne 1$. (In any case, there’s a proof of this formula at the bottom
of this page.) Now we need to take the limit of $A_N$ as $N \to \infty$.

First, suppose that $-1 < r < 1$. Then we saw in the first box of
Section 22.1.2 above that $\lim_{N\to\infty} r^N = 0$, so replace $N$ by $N+1$ to get
$\lim_{N\to\infty} r^{N+1} = 0$ as well. So

$$\lim_{N\to\infty} A_N = \lim_{N\to\infty}\frac{1 - r^{N+1}}{1 - r} = \frac{1 - 0}{1 - r} = \frac{1}{1 - r}.$$

Our geometric series converges to $1/(1 - r)$. Here’s how the whole argument
looks on one line, using sigma notation:

$$\sum_{n=0}^{\infty} r^n = \lim_{N\to\infty}\sum_{n=0}^{N} r^n = \lim_{N\to\infty}\frac{1 - r^{N+1}}{1 - r} = \frac{1}{1 - r}.$$

How about when $r$ isn’t between $-1$ and 1? It turns out that the geometric
series must diverge in this case; we’ll see why at the end of the next section.
So, in summary:

$$\sum_{n=0}^{\infty} r^n = \frac{1}{1 - r} \quad\text{if } -1 < r < 1;$$
otherwise, if $r \ge 1$ or $r \le -1$, the series diverges.

In the above geometric series, the first term is always 1, since $r^0 = 1$. If
you start at some other number $a$ instead, then the terms are $a$, $ar$, $ar^2$, and
so on. So you can multiply everything by $a$ to get a more general form of the
above principle:

$$\sum_{n=0}^{\infty} ar^n = \frac{a}{1 - r} \quad\text{if } -1 < r < 1;$$
otherwise, if $r \ge 1$ or $r \le -1$, the series diverges.


We’ll see plenty of examples of how to deal with geometric series in
Section 23.1 of the next chapter. Meanwhile, I promised that I’d prove the
formula

$$\sum_{n=0}^{N} r^n = \frac{1 - r^{N+1}}{1 - r}$$

from above. Here’s how: first, multiply the sum on the left by $(1 - r)$ to get

$$A_N(1 - r) = (1 - r)\sum_{n=0}^{N} r^n.$$

Now pull the factor of $(1 - r)$ through the sum and simplify to see that

$$A_N(1 - r) = \sum_{n=0}^{N} r^n(1 - r) = \sum_{n=0}^{N}\left(r^n - r^{n+1}\right).$$

The right-hand sum is a telescoping series (see Section 15.1.2 in Chapter 15
for a review of this), so the sum works out to be $r^0 - r^{N+1}$, or $1 - r^{N+1}$. So
$A_N(1 - r) = 1 - r^{N+1}$; now all you have to do to get our formula is to divide
by $(1 - r)$, which is nonzero since we assumed that $r \ne 1$.
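A formula like this is easy to spot-check with exact arithmetic. The sketch below compares the brute-force partial sum with the closed form using `Fraction`, so there is no round-off to worry about; both function names are invented for the example, and the formula of course requires $r \ne 1$.

```python
from fractions import Fraction

def partial_sum(r, N):
    """Brute force: 1 + r + r^2 + ... + r^N."""
    return sum(r**n for n in range(N + 1))

def closed_form(r, N):
    """The closed form (1 - r^(N+1)) / (1 - r), valid when r != 1."""
    return (1 - r ** (N + 1)) / (1 - r)

for r in (Fraction(1, 2), Fraction(-3), Fraction(5, 4)):
    for N in (0, 1, 7, 20):
        assert partial_sum(r, N) == closed_form(r, N)

print(closed_form(Fraction(1, 2), 10))   # 2047/1024, already close to 2
```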







22.3 The nth Term Test (Theory) 


For a series to converge, the sequence of partial sums has to have a limit. 
Remember that the partial sum after N steps represents your position after 
you have taken $N$ steps according to the megaphone dude’s orders. (See
Section 22.2 above if you don’t have any idea what I’m talking about.) Anyway,
if your position is going to converge to some special limiting position as you 
keep on taking more and more steps, then your steps have to become really 
really small. Otherwise you’ll blunder about and not stay consistently close 
to the special position. It’s not good enough to keep moving back and forth, 
close to the special position: you have to get really close, and stay really close. 

So, your step sizes, which are just given by the sequence $\{a_n\}$, eventually
have to become very small, at least if you want your series to converge.
Mathematically, this means that you need to have $a_n \to 0$ as $n \to \infty$. This leads
us to the nth term test:

nth term test: if $\lim_{n\to\infty} a_n \ne 0$, or the limit
doesn’t exist, then the series $\sum_{n=1}^{\infty} a_n$ diverges.

If $\lim_{n\to\infty} a_n = 0$, then the series may converge or it may diverge, and you have to
do more work to resolve the issue. Just beware: the nth term test cannot
be used to show that a series converges!

So this test is a sort of reality check: if the terms a n don’t tend to 0, stop 
right there — your series diverges. Otherwise, the problem is still open and 
you need to do more work. For example, we’ll soon see that 


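$$\sum_{n=1}^{\infty}\frac{1}{n^2} \text{ converges, but } \sum_{n=1}^{\infty}\frac{1}{\sqrt{n}} \text{ diverges.}$$

In both sums, the terms converge to 0:

$$\lim_{n\to\infty}\frac{1}{n^2} = 0 \quad\text{and}\quad \lim_{n\to\infty}\frac{1}{\sqrt{n}} = 0.$$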


The nth term test doesn’t apply in either case! It’s only when the limits are 
not zero that you can use the test to say that your series diverges. Here are 
some examples where the test is good: 


E (_ 3 )"， 


and 


E 


You see, we have 

lim 2 n = oo, 


lim (-3) n DNE, and lim 1 


All three series above diverge by the nth term test, since in each case the limit 
of the terms isn’t 0. Actually, the series are all geometric series, with ratios 
2, —3, and 1, respectively. In general, if you have a geometric series rU 
with r > 1 or r < —1, then the terms r n don’t go to 0 as n —> oo. (We saw 
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this in Section 22.1.2 above — check out the formula in the big box.) So the 
nth term test tells us that any geometric series with ratio not strictly between 
—1 and 1 diverges. 
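Here is a rough numerical sketch of why the nth term test can’t prove convergence. It adds up the first so-many terms of the two series with terms $1/n^2$ and $1/\sqrt{n}$ mentioned earlier in this section. The helper name `partial_sum` and the cutoffs are just choices made for this illustration, and a table of numbers is only suggestive, not a proof.

```python
import math

def partial_sum(term, n_terms):
    """Add up term(1) + term(2) + ... + term(n_terms)."""
    return sum(term(n) for n in range(1, n_terms + 1))

def inverse_square(n):
    return 1.0 / n**2

def inverse_root(n):
    return 1.0 / math.sqrt(n)

# In both cases the terms themselves shrink to 0, so the nth term test is silent.
# The partial sums tell two different stories, though.
for n_terms in (10, 100, 1000, 10000):
    print(n_terms,
          round(partial_sum(inverse_square, n_terms), 6),
          round(partial_sum(inverse_root, n_terms), 6))
```

The first column of sums settles down while the second keeps growing, even though the terms go to 0 in both cases.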

In a convergent series, although the terms $a_n$ must go to 0, that doesn’t 
mean that the limit of the series is 0. For example, the geometric progression 
1, 1/2, 1/4, 1/8, … with ratio r = 1/2 converges to 0, and we can actually 
work out the value of the associated series using the formula from the previous 
section: 

$$\sum_{n=0}^{\infty} \left(\frac{1}{2}\right)^n = 1 + \frac{1}{2} + \frac{1}{4} + \frac{1}{8} + \cdots = \frac{1}{1 - \frac{1}{2}} = 2.$$

So the underlying sequence converges to 0, but the series converges to 2. It 
couldn’t be the other way around: if a sequence converges to 2, then by the 
nth term test the associated series would diverge automatically. 

We’ll see some other examples of the nth term test in Section 23.2 in the 
next chapter. Meanwhile, it’s time to look at some more tests. 


22.4 Properties of Both Infinite Series and Improper Integrals 

It turns out that there are some connections between infinite series and 
improper integrals, particularly improper integrals with a problem spot at $\infty$. 
One of these connections is expressed in the integral test, which we’ll look at 
in Section 22.5.3 below. In this section, I want to show you that all four of the 
tests we have for improper integrals also work for infinite series. Let’s look at 
them one at a time. 


22.4.1 The comparison test (theory) 



Suppose that you have a series $\sum_{n=1}^{\infty} a_n$, where all the terms $a_n$ are 
nonnegative. If you suspect that the series diverges, find a smaller series 
$\sum_{n=1}^{\infty} b_n$ which also diverges and your suspicion is confirmed. That is, 
if $0 \le b_n \le a_n$ for all n, and $\sum_{n=1}^{\infty} b_n$ diverges, so does 
$\sum_{n=1}^{\infty} a_n$. If instead you suspect that your original series converges, 
find a bigger series $\sum_{n=1}^{\infty} b_n$ which also converges, and your suspicion 
is confirmed. That is, if $b_n \ge a_n \ge 0$ for all n, and $\sum_{n=1}^{\infty} b_n$ 
converges, then so does $\sum_{n=1}^{\infty} a_n$. 

This is basically the same as the comparison test for improper integrals. 
The justification of the series version of the test is virtually identical to that 
of the integral version, so I’ll leave it to you to fill in the details if you feel 
sufficiently motivated. 

By the way, the first term in the series doesn’t have to be n = 1: it could 
be anything at all. For example, consider 

$$\sum_{n=3}^{\infty} |\sin(n)| \left(\frac{1}{2}\right)^n.$$

This is quite easy to deal with, using the comparison test. You see, $|\sin(n)| \le 1$ 
for any n, so we can write 

$$\sum_{n=3}^{\infty} |\sin(n)| \left(\frac{1}{2}\right)^n \le \sum_{n=3}^{\infty} \left(\frac{1}{2}\right)^n.$$

The last sum converges as it is a geometric series with ratio 1/2, which 
is between $-1$ and 1. So we can use the comparison test and claim that our 
original series converges. We’ll see some more examples of the comparison 
test in the next chapter. 
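If you want to watch the comparison test at work, here is a small sketch that compares running totals of a series with terms $|\sin(n)|(1/2)^n$ against the bigger geometric series with terms $(1/2)^n$. The starting index 3, the cutoff 40, and the helper name `partial_sums` are assumptions chosen for this illustration only; any starting index would do.

```python
import math

def partial_sums(term, start, stop):
    """Return the running totals term(start) + ... + term(n) for n = start..stop."""
    totals, running = [], 0.0
    for n in range(start, stop + 1):
        running += term(n)
        totals.append(running)
    return totals

START = 3  # assumed starting index, for illustration only

smaller = partial_sums(lambda n: abs(math.sin(n)) * 0.5**n, START, 40)
bigger = partial_sums(lambda n: 0.5**n, START, 40)

# Value of the bigger (geometric) series: first term / (1 - common ratio).
geometric_value = 0.5**START / (1 - 0.5)

print(smaller[-1], bigger[-1], geometric_value)
```

Every running total of the smaller series stays below the matching total of the geometric series, which in turn stays below the finite value of that geometric series. That is exactly the squeeze the comparison test relies on.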


22.4.2 The limit comparison test (theory) 

In Section 20.4.1 of Chapter 20, we made the following definition: 

$$f(x) \sim g(x) \text{ as } x \to \infty \quad\text{means the same thing as}\quad \lim_{x\to\infty} \frac{f(x)}{g(x)} = 1.$$

There’s a version of this for sequences that looks almost the same: 

$$a_n \sim b_n \text{ as } n \to \infty \quad\text{means the same thing as}\quad \lim_{n\to\infty} \frac{a_n}{b_n} = 1.$$

The limit comparison test then says that if $a_n \sim b_n$ as $n \to \infty$, and all 
terms $a_n$ and $b_n$ are finite, then $\sum_{n=1}^{\infty} a_n$ and $\sum_{n=1}^{\infty} b_n$ 
both converge or both diverge. You can’t have one without the other. Of course, 
you don’t have to start at n = 1; you could start at n = 0, n = 19, or any other 
finite value of n that you like. Once again, the justification of this test is almost 
identical to the justification of the limit comparison test for improper integrals, 
so I’ll omit it. You can fill in the details if you like. By the way, if $a_n \sim b_n$ as 
$n \to \infty$, we say that the sequences are asymptotic to each other. 

All the properties of functions we looked at in Chapter 21 are still good 
for sequences. For example, consider 

$$\sum_{n=0}^{\infty} \sin\left(\frac{1}{2^n}\right).$$

When n is large, $1/2^n$ becomes very small (that is, close to 0). We know that 
$\sin(x) \sim x$ as $x \to 0$ (see Section 21.4.2 in the previous chapter); replacing x 
by $1/2^n$, we see that 

$$\sin\left(\frac{1}{2^n}\right) \sim \frac{1}{2^n} \quad\text{as } \frac{1}{2^n} \to 0.$$

Now, we can rewrite $1/2^n$ as $(1/2)^n$, and also note that $1/2^n \to 0$ is equivalent 
to $n \to \infty$. So the above relation can be written as 

$$\sin\left(\frac{1}{2^n}\right) \sim \left(\frac{1}{2}\right)^n \quad\text{as } n \to \infty.$$

The limit comparison test then says that the two series 

$$\sum_{n=0}^{\infty} \sin\left(\frac{1}{2^n}\right) \qquad\text{and}\qquad \sum_{n=0}^{\infty} \left(\frac{1}{2}\right)^n$$

both converge or both diverge. Now we know that the right-hand series 
converges, as it is a geometric series with ratio 1/2 (which is less than 1 in absolute 
value). So the left-hand series converges as well. By the way, the right-hand 
series converges to 2 (as we saw in Section 22.3 above); this does not mean 
that the left-hand series also converges to 2. We don’t know what it converges 
to, only that it converges. 
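As a sanity check on the asymptotic relation used here, the following sketch computes the ratio of $\sin(1/2^n)$ to $(1/2)^n$ for a range of n, and also adds up the first 60 terms of each series starting from n = 0. The function name `ratio` and the cutoffs are my own choices for illustration; floating-point numbers only hint at the limits.

```python
import math

def ratio(n):
    """Ratio a_n / b_n with a_n = sin(1/2^n) and b_n = (1/2)^n."""
    b_n = 0.5**n
    return math.sin(b_n) / b_n

# The ratios should creep up toward 1 as n grows.
ratios = [ratio(n) for n in range(0, 21)]

# Partial sums of both series, starting at n = 0.
left_sum = sum(math.sin(0.5**n) for n in range(0, 60))
right_sum = sum(0.5**n for n in range(0, 60))

print(ratios[0], ratios[10], ratios[20])
print(left_sum, right_sum)
```

The ratios head to 1, so the two series stand or fall together, but the two partial sums clearly settle near different numbers. The limit comparison test tells you about convergence, not about the value.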


22.4.3 The p-test (theory) 

There’s also a p-test for series. It’s basically the same as the p-test for 
improper integrals with problem spot at $\infty$. In particular, it says that 

$$\sum_{n=1}^{\infty} \frac{1}{n^p} \quad \begin{cases} \text{converges if } p > 1, \\ \text{diverges if } p \le 1. \end{cases}$$

The easiest proof of this uses the integral test, so I’ll postpone it to 
Section 22.5.3 below. Some simple examples of the p-test are that 

$$\sum_{n=1}^{\infty} \frac{1}{n^2} \text{ converges, but } \sum_{n=1}^{\infty} \frac{1}{\sqrt{n}} \text{ diverges.}$$

The power 2 in the first series is greater than 1, so the series converges. On 
the other hand, since $\sqrt{n} = n^{1/2}$, we have a power of 1/2 in the second series; 
since 1/2 is less than or equal to 1, the series diverges. 

Before we move on to the absolute convergence test, just consider the 
so-called harmonic series 

$$\sum_{n=1}^{\infty} \frac{1}{n}$$

for a few minutes. This series diverges by the p-test, but we can actually show 
that it diverges directly. The idea is to write out a whole bunch of terms of 
the series and then group them in a clever way. Specifically, the above series 
can be written out like this: 

$$1 + \frac{1}{2} + \left(\frac{1}{3} + \frac{1}{4}\right) + \left(\frac{1}{5} + \frac{1}{6} + \frac{1}{7} + \frac{1}{8}\right) + \left(\frac{1}{9} + \frac{1}{10} + \frac{1}{11} + \frac{1}{12} + \frac{1}{13} + \frac{1}{14} + \frac{1}{15} + \frac{1}{16}\right) + \cdots$$

Except for the 1 and 1/2 at the beginning, each grouping has twice as many 
terms as the previous grouping. Now here’s the main deal: the last term in 
each grouping is the smallest. So the above sum is bigger than 

$$1 + \frac{1}{2} + \left(\frac{1}{4} + \frac{1}{4}\right) + \left(\frac{1}{8} + \frac{1}{8} + \frac{1}{8} + \frac{1}{8}\right) + \left(\frac{1}{16} + \frac{1}{16} + \frac{1}{16} + \frac{1}{16} + \frac{1}{16} + \frac{1}{16} + \frac{1}{16} + \frac{1}{16}\right) + \cdots$$

In this new series, there is one term of size 1, one term of size 1/2, two terms 
of size 1/4, four terms of size 1/8, eight terms of size 1/16, and so on. That 
is, apart from the first term, each grouping adds up to exactly 1/2. So the 
above series is really equal to 

$$1 + \frac{1}{2} + \frac{1}{2} + \frac{1}{2} + \frac{1}{2} + \cdots,$$

which diverges! Finally, the comparison test shows that the harmonic series 
diverges, since it is bigger than the above divergent series. Now we get for 
free that $\sum_{n=1}^{\infty} 1/n^p$ diverges when p < 1, since $1/n^p \ge 1/n$ and you can use 
the comparison test again. (Try filling in the details.) 
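The grouping trick can be checked with exact arithmetic. Stopping the harmonic series after $2^k$ terms means you have the 1, the 1/2, and $k - 1$ further complete groupings, each worth at least 1/2, so the partial sum should be at least $1 + k/2$. The sketch below verifies this for small k; the function names are hypothetical helpers for this illustration.

```python
from fractions import Fraction

def harmonic_partial_sum(n_terms):
    """Exact value of 1 + 1/2 + ... + 1/n_terms."""
    return sum(Fraction(1, n) for n in range(1, n_terms + 1))

def grouping_lower_bound(k):
    """Lower bound 1 + k/2 for the partial sum with 2**k terms."""
    return 1 + Fraction(k, 2)

# Check the bound for the first few powers of 2.
for k in range(0, 11):
    assert harmonic_partial_sum(2**k) >= grouping_lower_bound(k)

print(float(harmonic_partial_sum(2**10)), float(grouping_lower_bound(10)))
```

Since $1 + k/2$ can be made as large as you like by taking k large, the partial sums can’t settle down to a finite limit.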


22.4.4 The absolute convergence test 

Suppose you have a series $\sum_{n=1}^{\infty} a_n$ with terms $a_n$ which are sometimes 
positive and sometimes negative. This kind of sucks; it makes life more difficult 
(or more interesting, depending on your point of view). If eventually all the 
terms $a_n$ become positive, then there’s no problem: you can just ignore all 
the terms at the beginning and start the series at the point where all the terms 
are positive. Remember, the beginning terms of a series have no impact on 
whether the series converges or diverges. Similarly, if the terms eventually 
become negative, you can ignore the beginning terms and end up with a series 
with only negative terms. Then consider the series $\sum_{n=m}^{\infty} (-a_n)$, which has 
all positive terms: if it converges, so does the original series, and if it diverges, 
so does the original series. This is because this new series is just the negative 
of the original series. 

So, what if the series keeps switching between positive and negative terms? 
Some examples of this are 

$$\sum_{n=3}^{\infty} \sin(n) \left(\frac{1}{2}\right)^n, \qquad \sum_{n=1}^{\infty} \frac{(-1)^n}{n^2}, \qquad\text{and}\qquad \sum_{n=1}^{\infty} \frac{(-1)^n}{n}.$$

The second and third of these series are actually alternating series. This 
means that the terms alternate between positive and negative numbers. For 
example, the third series can be expanded as 

$$-1 + \frac{1}{2} - \frac{1}{3} + \frac{1}{4} - \frac{1}{5} + \cdots,$$

and you can clearly see that every other term is negative. On the other hand, 
the first series above is not alternating. Sometimes sin(n) is positive and 
sometimes it’s negative, but it doesn’t alternate. For example, sin(1), sin(2), 
and sin(3) are all positive (since 1, 2, and 3 are all between 0 and $\pi$), whereas 
sin(4), sin(5) and sin(6) are all negative. 

Anyway, there is a special test to deal with alternating series, which we’ll 
look at in Section 22.5.4 below. We still have the absolute convergence test, 
however, which says that if $\sum_{n=1}^{\infty} |a_n|$ converges, so does $\sum_{n=1}^{\infty} a_n$. Again, the 
series can start at any value of n, not necessarily n = 1. Let’s see how this 
works for our above examples. For the first one, 

$$\sum_{n=3}^{\infty} \sin(n) \left(\frac{1}{2}\right)^n,$$

consider the absolute version of the series: 

$$\sum_{n=3}^{\infty} |\sin(n)| \left(\frac{1}{2}\right)^n.$$

Note that we only needed absolute value signs around sin(n), since the factor 
$(1/2)^n$ is always positive. Anyway, we already used the comparison test in 
Section 22.4.1 above to show that the above series converges. The absolute 
convergence test then says that the original series above (without the 
absolute values) converges too. In fact, we say that the original series converges 
absolutely. More on this in Section 22.5.4 below. 

For the second series, 

$$\sum_{n=1}^{\infty} \frac{(-1)^n}{n^2},$$

the absolute version is 

$$\sum_{n=1}^{\infty} \frac{1}{n^2}.$$

This converges by the p-test (since 2 > 1), so the original series converges 
absolutely, by the absolute convergence test. 

For the third series, 

$$\sum_{n=1}^{\infty} \frac{(-1)^n}{n},$$

the absolute version is 

$$\sum_{n=1}^{\infty} \frac{1}{n}.$$

This diverges by the p-test, so you cannot apply the absolute convergence 
test. That is, you cannot conclude that the original series 

$$\sum_{n=1}^{\infty} \frac{(-1)^n}{n}$$

diverges. All you can say is that this series does not converge absolutely. In 
fact, in Section 22.5.4 below, we’ll see that the series does in fact converge, 
even though its absolute version diverges! Before we do that, however, we 
have a few other tests to look at. 


22.5 New Tests for Series 

Let’s look at four tests for convergence of series which have no corresponding 
improper integral version: the ratio test, the root test, the integral test, and 
the alternating series test. We’ll examine them one at a time before seeing 
how to apply them in the next chapter. 


22.5.1 The ratio test (theory) 


Here’s a really really useful test which only works for series, not improper 
integrals. It’s called the ratio test because it involves the ratio of successive 
terms of a sequence. Let’s set the scene: suppose we have a series $\sum_{n=1}^{\infty} a_n$. 
We’d like the terms to go to 0 fast enough for this series to converge. Here’s 
one way this can happen: suppose we consider a new sequence, which we’ll 
call $b_n$, of the absolute value of ratios of successive terms of the series. That 
is, we let 

$$b_n = \left|\frac{a_{n+1}}{a_n}\right|$$

for each n. This is a sequence, so maybe it converges to something. Now 
here’s the result: if the sequence $\{b_n\}$ converges to a number less than 1, 
then we can immediately conclude that the series $\sum_{n=1}^{\infty} a_n$ converges. In 
fact, it converges absolutely: that is, $\sum_{n=1}^{\infty} |a_n|$ also converges. On the other 
hand, if the sequence $\{b_n\}$ converges to a number greater than 1, then the 
series $\sum_{n=1}^{\infty} a_n$ diverges. If the sequence $\{b_n\}$ converges to 1, or if it doesn’t 
converge, then we can’t say anything about the original series. 

We’ll look at a lot of examples of the ratio test in the next chapter, so 
let’s just see if we can justify the test. This is a tricky argument, so don’t 
worry if you get lost; just skip to the next section. Let’s give it a try, though. 
We might as well assume that $a_n > 0$ for all n, so we can drop the absolute 
values. Suppose that $b_n$ converges to a number L which is less than 1. That 
is, suppose that 

$$\frac{a_{n+1}}{a_n} \to L < 1 \quad\text{as } n \to \infty.$$

Well, this means that when n is large, the ratio $a_{n+1}/a_n$ is approximately 
equal to L. If this ratio were exactly equal to L, then the series would be a 
geometric series with ratio L, which converges since L < 1. Since it’s only 
equal to L in the limit, we have to be a bit more clever. 

The idea is to let r be equal to the average of L and 1. Since L < 1, the 
average r lies between L and 1, so r is also less than 1. That is, L < r < 1. 
So what? Well, since the ratio $a_{n+1}/a_n$ converges to L, eventually it must 
always be less than r. That is, the ratios can wander around doing whatever 
they like for a while, but then they get serious and start getting close to L. 
You can’t get close to L without being less than r, since r is bigger than L. 
So, the point is that if you throw away enough of the series at the beginning, 
you can always say that $a_{n+1}/a_n$ is less than r. 

Let’s see where we’re at: we started off with $\sum_{n=1}^{\infty} a_n$, but we’ve thrown 
away a whole bunch of the terms at the beginning to get $\sum_{n=m}^{\infty} a_n$ for some 
number m. This throwing-away routine doesn’t affect whether the series 
converges. On the other hand, it helps because we are sure that $a_{n+1}/a_n < r$ for 
all $n \ge m$. Another way of writing this is $a_{n+1} < r a_n$ for all $n \ge m$. 

Now comes the real meat: the sequence $\{a_n\}$ is dominated by a geometric 
progression with ratio r. After all, to advance from one term $a_n$ to the next 
term $a_{n+1}$, you multiply by some number less than r (since $a_{n+1} < r a_n$). On 
the other hand, to advance from one term of a geometric progression with 
ratio r to the next term, you actually do multiply by r. So if the geometric 
progression starts at $a_m$, then it pulls ahead of our sequence $\{a_n\}$ and stays 
ahead. (All this can be justified by using induction. Assume that $a_n \le A r^n$. 
Then multiply both sides by r to get $r a_n \le A r^{n+1}$. Since $a_{n+1} < r a_n$, we 
have $a_{n+1} < A r^{n+1}$. Now you just have to choose A so that $a_m \le A r^m$ as 
well; any number greater than $a_m / r^m$ will do.) 

OK, we’ve shown that $a_n \le A r^n$ for some number A. This means that 

$$\sum_{n=m}^{\infty} a_n \le A \sum_{n=m}^{\infty} r^n.$$

Since 0 < r < 1, the right-hand side converges, so, by the comparison test, 
the left-hand side converges too. Finally, by the absolute convergence test, 
$\sum_{n=m}^{\infty} a_n$ also converges even if some of the terms $a_n$ are negative after all. 

Not so easy. Luckily the divergence version isn’t so bad. Suppose that 
the ratios $|a_{n+1}/a_n|$ converge to a number L which is bigger than 1. Now if 
we throw away enough of the series, we can just look at $\sum_{n=m}^{\infty} |a_n|$, where 
m is large enough to force $|a_{n+1}/a_n| > 1$ for all $n \ge m$. This means that 
$|a_{n+1}| > |a_n|$ for all $n \ge m$. The terms $|a_n|$ are actually getting bigger as n 
gets larger, so we can’t possibly have $\lim_{n\to\infty} a_n = 0$. Now we can just use the 
nth term test to say that $\sum_{n=m}^{\infty} a_n$ diverges, so $\sum_{n=1}^{\infty} a_n$ also diverges. 

Now all that’s left is to convince ourselves that everything breaks down 
if L = 1. Here’s a good example of what can go wrong: consider the series 
$\sum_{n=1}^{\infty} 1/n^p$. Let’s work out the ratio of successive terms: 

$$\left|\frac{a_{n+1}}{a_n}\right| = \frac{1/(n+1)^p}{1/n^p} = \frac{n^p}{(n+1)^p} = \left(\frac{n}{n+1}\right)^p.$$

We were able to drop the absolute value signs since everything is positive. In 
any case, as $n \to \infty$, it’s easy to see that $n/(n+1) \to 1$, so the pth power 
also goes to 1. That is, 

$$\lim_{n\to\infty} \left|\frac{a_{n+1}}{a_n}\right| = \lim_{n\to\infty} \left(\frac{n}{n+1}\right)^p = 1^p = 1.$$

So the limit L of the ratios is 1, regardless of what p is. Now, we know 
that $\sum_{n=1}^{\infty} 1/n^p$ converges if p > 1 and diverges if $p \le 1$. The limiting ratio 
L = 1 cannot distinguish between these two possibilities. This one example 
is enough to show that if L = 1, then the original series could converge or it 
could diverge: you just can’t tell. 
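To see numerically how the ratio test goes blind here, the sketch below evaluates the ratio of successive terms of the series with terms $1/n^p$ for a few values of p. The function name `successive_ratio` and the sample values of p and n are assumptions made for illustration.

```python
def successive_ratio(p, n):
    """|a_(n+1) / a_n| for a_n = 1/n**p, simplified to (n / (n + 1))**p."""
    return (n / (n + 1)) ** p

# p = 2 gives a convergent series, p = 1 and p = 0.5 give divergent ones,
# yet the ratios all drift toward 1.
for p in (0.5, 1, 2):
    print(p, [round(successive_ratio(p, n), 6) for n in (10, 1000, 100000)])
```

The printed ratios look essentially the same for the convergent and divergent cases once n is large, which is the whole problem with L = 1.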


22.5.2 The root test (theory) 

The root test (also called the nth root test) is a close cousin of the ratio test. 
Instead of considering ratios of successive terms, just consider the nth root of 
the absolute value of the nth term. That is, starting with a series $\sum_{n=1}^{\infty} a_n$, 
let’s make a new sequence given by 

$$b_n = |a_n|^{1/n}.$$

(Remember, raising a quantity to the power 1/n is the same as taking the nth 
root.) Now you see whether the sequence $\{b_n\}$ converges and try to find the 
limit. If the limit is less than 1, then the series $\sum_{n=1}^{\infty} a_n$ converges (in fact, 
converges absolutely). If the limit is greater than 1, the series diverges. If the 
limit equals 1, then you can’t tell what the heck is going on and have to try 
something else. 

Again, we’ll look at an example in the next chapter. Let’s try to justify 
what’s going on here. If this seems a little nasty, just skip to the next section. 
Anyway, the main idea is that the test is again inspired by looking at a 
geometric series. Suppose that $a_n = r^n$. Then the nth root of $|a_n|$ is exactly 
$|r|$. So the series converges if $|r| < 1$ and diverges otherwise. Now, we don’t 
exactly have a geometric series but it’s pretty close. Let’s start off with the 
assumption that 

$$|a_n|^{1/n} \to L < 1 \quad\text{as } n \to \infty.$$

By the same logic we used in the justification of the ratio test, we let r be the 
average of L and 1 and realize that eventually $|a_n|^{1/n} \le r$. That is, after a 
certain point n = m in the series, $|a_n| \le r^n$. So we have 

$$\sum_{n=m}^{\infty} |a_n| \le \sum_{n=m}^{\infty} r^n.$$

Since r < 1, the right-hand series converges and we can use the comparison 
test to show that the left-hand series converges as well; so $\sum_{n=1}^{\infty} a_n$ converges 
absolutely. 

On the other hand, suppose that the limit L is greater than 1, that is, 

$$|a_n|^{1/n} \to L > 1 \quad\text{as } n \to \infty.$$

Eventually for large enough n, it’s always true that $|a_n|^{1/n} > 1$, which means 
that $|a_n| > 1$. So $\sum_{n=1}^{\infty} a_n$ diverges by the nth term test, since the terms 
can’t go to 0. 

If the limit L is exactly 1, the test is still utterly useless. Again the example 
$\sum_{n=1}^{\infty} 1/n^p$ illustrates this pretty clearly. I leave it to you to show that 

$$\lim_{n\to\infty} \left|\frac{1}{n^p}\right|^{1/n} = \lim_{n\to\infty} n^{-p/n} = 1.$$

(Treat it as a l’Hôpital Type C problem; see Section 14.1.5 of Chapter 14 to 
learn about this type of problem.) We know the series $\sum_{n=1}^{\infty} 1/n^p$ diverges 
for some values of p and converges for other values of p. It follows that the 
root test can’t possibly give any useful information, since the above limit is 
1, no matter what p is. 
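If you’d like some numerical reassurance before doing the limit by hand, this sketch evaluates the nth root of the nth term of the series with terms $1/n^p$. The name `nth_root_of_term` and the sample values are assumptions for illustration, and of course a few decimal places don’t replace the actual limit computation.

```python
def nth_root_of_term(p, n):
    """|a_n|**(1/n) for a_n = 1/n**p, which equals n**(-p/n)."""
    return n ** (-p / n)

# Whether p gives a convergent series (p = 2) or a divergent one (p = 0.5),
# the nth roots creep toward 1.
for p in (0.5, 1, 2):
    print(p, [round(nth_root_of_term(p, n), 6) for n in (10, 1000, 100000)])
```

As with the ratio test, the numbers can’t tell the convergent and divergent cases apart.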


22.5.3 The integral test (theory) 

We already saw in Section 22.4 above that there’s a connection between 
improper integrals and infinite series. The integral test really nails down this 
connection. In particular, suppose you have a series $\sum_{n=1}^{\infty} a_n$ whose terms 
$a_n$ are positive and decreasing. By “decreasing,” I mean that $a_{n+1} \le a_n$ 
for all n. (Technically, I should say “nonincreasing” since the inequality isn’t 
strict.) An example of such a series is $\sum_{n=1}^{\infty} 1/n^p$ for any p > 0: the terms 
are certainly positive, and it’s easy to see that they are also decreasing. Let’s 
draw a picture of the general situation: 




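[Figure: the terms $a_1, a_2, a_3, \ldots$ plotted as dots of decreasing height above n = 1, 2, 3, …; horizontal axis n, vertical axis $a_n$.] 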


The axes are actually labeled n and $a_n$ instead of x and y. The idea is that 
the height of the dot above the number n is the value of $a_n$. Notice that all 
the dots are above the x-axis (actually, the n-axis!) since all the terms $a_n$ are 
positive; also, the heights are getting smaller, so the terms $a_n$ are decreasing. 

Now, imagine you can find some continuous function f that is decreasing 
and connects the dots: 



Since the curve y = f(x) passes through every dot, we have $f(n) = a_n$ for all 
positive integers n. Now consider the integral 

$$\int_1^{\infty} f(x)\,dx.$$

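If that integral converges, so does the series $\sum_{n=1}^{\infty} a_n$. Why is it so? Well, 
let’s draw some sneaky lines in the picture: 

Each rectangle has a base of 1 unit and a height equal to one of the terms, so 
the area of the rectangle of height $a_n$ (in square units) is $a_n$. The rectangles 
of heights $a_2, a_3, a_4, \ldots$ all fit underneath the curve, so the total area of 
these rectangles is no more than the area under the curve. If the integral 
converges, the total area of the rectangles must be some finite number, so 
$\sum_{n=2}^{\infty} a_n$ converges by the comparison test. (Poor old $a_1$ doesn’t get a 
rectangle, but one extra term at the beginning doesn’t affect convergence.) 

On the other hand, if the integral diverges, slide each rectangle one unit to the 
right, so that the rectangle of height $a_n$ sits above the interval from n to n + 1. 
Since f is decreasing, these rectangles now cover the whole region under the curve, so 

$$\sum_{n=1}^{\infty} a_n \ge \int_1^{\infty} f(x)\,dx = \infty,$$

and the comparison test now shows that $\sum_{n=1}^{\infty} a_n$ diverges. 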

In summary, we have the integral test: if f is a decreasing positive 
function such that $f(n) = a_n$ for all positive integers n, then 

$$\int_1^{\infty} f(x)\,dx \qquad\text{and}\qquad \sum_{n=1}^{\infty} a_n$$

either both converge or both diverge. Again, the series can start at any 
number, not just n = 1; just change the lower bound of the integral to match. 
We’ll see some examples of how to use the integral test in the next chapter, 
but for the moment we can at least use it to prove the p-test for series, which 
we first saw in Section 22.4.3 above. 

So, to investigate the convergence of $\sum_{n=1}^{\infty} 1/n^p$, first suppose that p > 0 
and consider the function f defined by $f(x) = 1/x^p$ for x > 0. This function 
clearly agrees with $1/n^p$ when x = n, and it’s also decreasing. (One way of 
showing this is by considering the derivative. In this case, $f'(x) = -p x^{-p-1}$, 
which is a negative quantity for x > 0, so f is decreasing.) Anyway, we can 
now use the integral test to say that 

$$\int_1^{\infty} \frac{1}{x^p}\,dx \qquad\text{and}\qquad \sum_{n=1}^{\infty} \frac{1}{n^p}$$

either both converge or both diverge. Which is it? Well, when p > 1, the 
integral converges by the integral p-test, so the series does as well. When 
$0 < p \le 1$, the integral diverges by the integral p-test, so the series diverges 
as well. 

How about when p < 0? Then you can’t use the integral test, since the 
function f given by $f(x) = 1/x^p$ is actually increasing. You see, if p < 0, then 
we can write $p = -q$ for some q > 0. Then 

$$\sum_{n=1}^{\infty} \frac{1}{n^p} = \sum_{n=1}^{\infty} \frac{1}{n^{-q}} = \sum_{n=1}^{\infty} n^q.$$

This last series diverges by the nth term test, since $n^q \to \infty$ (not 0) as $n \to \infty$. 
Finally, if p = 0, the series $\sum_{n=1}^{\infty} 1/n^p$ is just $\sum_{n=1}^{\infty} 1 = 1 + 1 + 1 + \cdots$, 
which clearly diverges. Putting everything together, we see that $\sum_{n=1}^{\infty} 1/n^p$ 
converges when p > 1 and diverges when $p \le 1$, which is exactly the p-test 
for series! 
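The rectangle picture behind the integral test can be sketched numerically for $f(x) = 1/x^p$. For a positive decreasing f, the rectangles of heights $a_2, \ldots, a_N$ fit under the curve between 1 and N, while the rectangles of heights $a_1, \ldots, a_{N-1}$ cover it, so the integral from 1 to N should be sandwiched between those two partial sums. The function names and the cutoff N = 1000 are assumptions for this illustration.

```python
import math

def integral_1_to_N(p, N):
    """Exact value of the integral of 1/x**p from 1 to N (p > 0)."""
    if p == 1:
        return math.log(N)
    return (N ** (1 - p) - 1) / (1 - p)

def partial_sum(p, first, last):
    """Sum of 1/n**p for n = first, ..., last."""
    return sum(1 / n**p for n in range(first, last + 1))

def sandwich(p, N):
    """Return (rectangles under the curve, integral, rectangles over the curve)."""
    return partial_sum(p, 2, N), integral_1_to_N(p, N), partial_sum(p, 1, N - 1)

for p in (0.5, 1, 2):
    under, area, over = sandwich(p, 1000)
    print(p, round(under, 4), round(area, 4), round(over, 4))
```

For p = 2 all three numbers stay bounded as N grows, and for p = 1 or p = 0.5 all three grow without bound together, matching the p-test.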

22.5.4 The alternating series test 

Suppose you have a series $\sum_{n=1}^{\infty} a_n$ where the terms can’t make up their 
minds whether to be positive and negative but instead keep switching sign. 
We already saw some examples of this in Section 22.4.4 above. Sometimes 
the absolute convergence test saves the day here, since if the absolute version 
$\sum_{n=1}^{\infty} |a_n|$ converges, so does our original series. But what if the absolute 
version diverges? What on earth do you do then? 

This is quite a question. There’s no easy answer, in general. This is a 
tricky little topic which has inspired much thought and discussion over the 
years. Let’s be happy with a simple test that comes up surprisingly often in 
applications. Suppose that your series is alternating. Remember, this means 
that every second term is positive and every other term is negative. If you 
take a series with positive terms and multiply each term by $(-1)^n$, then 
you get an alternating series. (You could use $(-1)^{n+1}$ instead.) Two of the 
series we looked at above, 

$$\sum_{n=1}^{\infty} \frac{(-1)^n}{n^2} \qquad\text{and}\qquad \sum_{n=1}^{\infty} \frac{(-1)^n}{n},$$

are alternating. We have already seen (in Section 22.4.4) that the first of 
these series converges absolutely, so it converges. The second one is more 
interesting. It doesn’t converge absolutely, since its absolute version $\sum_{n=1}^{\infty} 1/n$ 
diverges. Amazingly, it turns out that the original series $\sum_{n=1}^{\infty} (-1)^n/n$ 
converges! When a series converges, but its absolute version diverges, we say that 
the series converges conditionally. So $\sum_{n=1}^{\infty} (-1)^n/n$ converges conditionally; 
let’s see why. 

The alternating series test says that if a series $\sum_{n=1}^{\infty} a_n$ is alternating, and 
the absolute values of its terms are decreasing to 0, then the series converges. 
That is, we need $a_n$ to be alternately positive and negative, and $|a_n|$ to be 
decreasing, and $\lim_{n\to\infty} |a_n| = 0$. In that case, the series converges. So, for 
example, the above series $\sum_{n=1}^{\infty} (-1)^n/n$ converges since it is alternating, and 
the absolute values of the terms are $\{1/n\}$, which is a decreasing sequence 
with limit 0. We’ll summarize the test and see some more examples of the 
alternating series test in Section 23.7 of the next chapter. 

Why does the test work? Well, first let’s just do a reality check. One of 
the conditions is that the limit of the terms of the series has to be 0. If that 
isn’t true, then the series diverges by the nth term test! So that condition’s 
a no-brainer. Now, here’s how the rest of it works. Consider the partial sums 
$\{A_N\}$, where $A_N = \sum_{n=1}^{N} a_n$. Because $a_n$ keeps alternating between positive 
and negative values, the partial sums $A_N$ wobble back and forth. Think back 
to the idea of the megaphone guy telling you to step back and forth: every 
second call he makes is a forward step, and every other call is a backward step. 

Say the first step $a_1$ is forward, so the odd-numbered terms are positive and 
the even-numbered terms are negative. Picture your right foot landing on the 
odd partial sums $A_1, A_3, A_5, \ldots$ and your left foot on the even partial sums 
$A_2, A_4, A_6, \ldots$. Let’s check that the right foot keeps moving back. For 
example, $A_5$ is the sum of the first five values of $a_n$. But $A_3$ is the sum of the 
first three values, so we can write $A_5 = A_3 + a_4 + a_5$. (If you know where you 
are after three steps, namely $A_3$, then you can just take the next two steps of 
signed length $a_4$ and $a_5$ to see where you are after five steps, which is $A_5$.) 
Anyway, $a_4 + a_5 < 0$, since $a_4$ is negative, $a_5$ is positive, and $|a_4| > |a_5|$. 
This means that $A_5 < A_3$. If you continue this process, then you find that 

$$A_1 > A_3 > A_5 > A_7 > \cdots,$$

so your right foot is indeed moving farther back as time goes by. 

You can repeat the same argument (but in the opposite direction) with 
the even terms $A_2$, $A_4$, $A_6$, and so on. Try it and see if you can show that 

$$A_2 < A_4 < A_6 < A_8 < \cdots,$$

so your left foot is moving forward as time goes by. Now, here’s the main 
point: the odd sequence $A_1, A_3, A_5, \ldots$ is decreasing, so either it drops off to 
$-\infty$, or it converges to some finite value. It can’t drop off to $-\infty$, though, 
because all these terms are bigger than $A_2$. (Why is that true?) Similarly, 
the even sequence $A_2, A_4, A_6, \ldots$ is increasing, so either it blows up to $\infty$ 
or it converges. It can’t blow up to $\infty$ since all these terms are less than 
$A_1$. (Again, why?) So both the odd and the even sequences converge. Since the 
differences $|a_n|$ between odd and even terms are shrinking to 0, the limits of 
both sequences must be the same! That is, the odd sequence decreases to the same 
limit that the even sequence increases to: your feet are moving closer and closer 
together until they are arbitrarily close together. That’s all that you need 
to show that the full sequence $\{A_N\}$ of partial sums converges, which means 
that the original series $\sum_{n=1}^{\infty} a_n$ converges too. 
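Here is a sketch of the wobbling partial sums for one concrete alternating series. To match the argument above, where the odd-numbered terms are positive, it uses $a_n = (-1)^{n+1}/n$ rather than $(-1)^n/n$; that choice, the cutoff of 2000 terms, and the function name are all assumptions for illustration. The comparison with $\ln(2)$ uses a well-known value for this particular series that isn’t derived in this chapter.

```python
import math

def partial_sums(n_terms):
    """Running totals A_1, ..., A_N of the series with a_n = (-1)**(n+1) / n."""
    totals, running = [], 0.0
    for n in range(1, n_terms + 1):
        running += (-1) ** (n + 1) / n
        totals.append(running)
    return totals

A = partial_sums(2000)
odd = A[0::2]    # A_1, A_3, A_5, ... (positions after a forward step)
even = A[1::2]   # A_2, A_4, A_6, ... (positions after a backward step)

print(odd[:3], even[:3])
print(even[-1], math.log(2), odd[-1])
```

The odd partial sums march down, the even ones march up, and the gap between them is just the size of the latest term, so both are squeezed toward the same limit.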

So the alternating series test works. It’s important that you only use it 
after checking that your given series is not absolutely convergent. We’ll see 
how this works in the next chapter when we look at lots of examples. 












CHAPTER 23 


How to Solve Series Problems 


The scenario: you are given a series $\sum_{n=1}^{\infty} a_n$, and you want to know whether 
or not it converges. If it does converge, then perhaps you’d like to know its 
value (that is, what it converges to). The series has to be pretty special in 
order to find a nice expression for its value. Of course, the series may not 
start at n = 1 as in the above series — it could be n = 0 or some other value 
of n. 

This chapter is all about giving you a blueprint of how to proceed. Here’s 
a possible flowchart for how to approach a series: 

1. Is the series geometric? If your series only involves exponentials like 
$2^n$ or $e^{3n}$, it might be a geometric series, or it might be the sum of one 
or more geometric series. See Section 23.1 below to see how to deal with 
this case. 

2. Do the terms go to 0? If the series isn’t geometric, try the nth term 
test. Check that the terms converge to 0; otherwise the series diverges 
by the nth term test. See Section 23.2 below for more details. 

3. Are there negative terms in the series? If so, you may have to use 
the absolute convergence test or the alternating series test. See 
Section 23.7 at the end of this chapter for more information. 

4. Are factorials involved? If so, use the ratio test. The test is also 
useful when there are exponentials involved but the series isn’t geometric. 
See Section 23.3 below. 

5. Are there tricky exponentials with n in the base and the 
exponent? If so, try the root test. In general, if it is easy to take the 
nth root of the term $a_n$, the root test is probably a winner; check out 
Section 23.4 below for more details. 

6. Do the terms have a factor of exactly 1/n as well as logarithms? 
In that case, the integral test is probably what you want. We’ll look 
at this test in Section 23.5 below. 

7. Do none of the above tests seem to work? You may have to use the 
comparison test or the limit comparison test in conjunction with 
the p-test, as well as all the understanding of the behavior of functions 
which we looked at in Chapter 21. We’ll see how to apply these tests in 
Section 23.6 below. 

The above blueprint will help guide your way through a lot of different series. 
It’s not perfect! There are always tricks and traps that could arise. Hopefully 
these will be pretty rare. My advice is to master all this material, then worry 
about the once-in-a-blue-moon cases as you come across them in your studies. 
Anyway, let’s get on with the details. 


23.1 How to Evaluate Geometric Series 


If your series only involves exponentials like $2^n$ or $e^{3n}$, it might be 
the sum of one or more geometric series. As we saw in the previous 
chapter, geometric series are simple enough that you can actually find their 
values (if they converge). The general form of a geometric series is $\sum_{n=m}^{\infty} a r^n$, 
where r is the common ratio. On page 485, we saw how to find the value of the 
series. Rather than learn the formula in mathematical language, I recommend 
learning it in words: 

$$\text{sum of infinite geometric series} = \frac{\text{first term}}{1 - \text{common ratio}}, \quad\text{if } -1 < \text{ratio} < 1.$$

If the common ratio isn’t between $-1$ and 1, then the series diverges. 

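Let’s see how it works. Suppose you want to find 

$$\sum_{n=5}^{\infty} \frac{4}{3^n}.$$

This is a geometric series, since you can write 

$$\sum_{n=5}^{\infty} \frac{4}{3^n} = \sum_{n=5}^{\infty} 4 \left(\frac{1}{3}\right)^n.$$

From this, we can see that the common ratio is 1/3. This ratio is between $-1$ 
and 1, so the series converges. To what, you ask? Well, the first term occurs 
when n = 5, so it is $4/3^5$. So 

$$\sum_{n=5}^{\infty} 4 \left(\frac{1}{3}\right)^n = \frac{4/3^5}{1 - \frac{1}{3}},$$

which works out to be 2/81. 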
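The formula in words translates almost directly into a few lines of code. This is just a sketch: the function name `geometric_series_sum` is made up for this illustration, and exact fractions are used so that the answer comes out as 2/81 rather than as a rounded decimal.

```python
from fractions import Fraction

def geometric_series_sum(first_term, ratio):
    """first term / (1 - common ratio), valid only when -1 < ratio < 1."""
    if not -1 < ratio < 1:
        raise ValueError("series diverges: common ratio is not strictly between -1 and 1")
    return first_term / (1 - ratio)

# The series with terms 4/3^n starting at n = 5:
# first term 4/3^5, common ratio 1/3.
value = geometric_series_sum(Fraction(4, 3**5), Fraction(1, 3))

# A long partial sum, as a rough cross-check on the formula.
partial = sum(Fraction(4, 3**n) for n in range(5, 60))

print(value, float(partial))
```

Notice that the only two ingredients are the first term and the common ratio; the starting index matters only through the first term.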
Here’s a trickier example: 

$$\sum_{n=2}^{\infty} \frac{2^{2n} - (-7)^n}{11^n}.$$

This is not a geometric series, but it can be split up into the difference of two 
geometric series: 

$$\sum_{n=2}^{\infty} \frac{2^{2n} - (-7)^n}{11^n} = \sum_{n=2}^{\infty} \frac{2^{2n}}{11^n} - \sum_{n=2}^{\infty} \frac{(-7)^n}{11^n}.$$

Why are both these pieces geometric series? In the first series, you can replace 
$2^{2n}$ by $4^n$, then express $4^n/11^n$ as $(4/11)^n$. This last trick also works in the 
second series, so we have 

$$\sum_{n=2}^{\infty} \frac{2^{2n} - (-7)^n}{11^n} = \sum_{n=2}^{\infty} \left(\frac{4}{11}\right)^n - \sum_{n=2}^{\infty} \left(-\frac{7}{11}\right)^n.$$

Both these series converge, since their common ratios are 4/11 and $-7/11$ 
(respectively) and both these numbers are between $-1$ and 1. So we can use 
the above formula. The first terms occur when n = 2, so they are $(4/11)^2$ and 
$(-7/11)^2$, respectively. All in all, the series works out to be 

$$\frac{(4/11)^2}{1 - (4/11)} - \frac{(-7/11)^2}{1 - (-7/11)},$$

which simplifies to $-5/126$. 

How about if we change the problem slightly? Consider 

$$\sum_{n=2}^{\infty} \frac{2^{2n} - (-13)^n}{11^n}.$$

Again, we can split up the sum and group terms to rewrite this as 

$$\sum_{n=2}^{\infty} \left(\frac{4}{11}\right)^n - \sum_{n=2}^{\infty} \left(-\frac{13}{11}\right)^n.$$

Don’t even bother working out the first series: just notice that it converges, 
but the second one diverges since the ratio $-13/11$ isn’t between $-1$ and 1. 
The sum of a divergent series and a convergent series must diverge! 
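The splitting trick can be sketched in code as well. Each piece is a geometric series of the form ratio^n starting at n = 2, so its first term is the ratio squared. The helper names below are hypothetical, and exact fractions are used so the arithmetic can be compared with the hand computation.

```python
from fractions import Fraction

def geometric_series_sum(first_term, ratio):
    """first term / (1 - common ratio), valid only when -1 < ratio < 1."""
    if not -1 < ratio < 1:
        raise ValueError("series diverges: common ratio is not strictly between -1 and 1")
    return first_term / (1 - ratio)

def sum_of_powers(ratio, start):
    """Value of ratio**start + ratio**(start + 1) + ... when it converges."""
    return geometric_series_sum(ratio**start, ratio)

# Numerator 2^(2n) - (-7)^n over 11^n: both pieces converge.
total = sum_of_powers(Fraction(4, 11), 2) - sum_of_powers(Fraction(-7, 11), 2)

# Numerator 2^(2n) - (-13)^n over 11^n: the second piece has ratio -13/11.
try:
    sum_of_powers(Fraction(-13, 11), 2)
    second_piece_diverges = False
except ValueError:
    second_piece_diverges = True

print(total, second_piece_diverges)
```

The divergent case is caught by the ratio check before any arithmetic happens, which mirrors the advice not to bother evaluating the convergent piece.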

As we’ve seen, geometric series are fairly easy to deal with. If your series 
isn’t geometric, keep working your way down this list, beginning with the nth 
term test. 

23.2 How to Use the nth Term Test 

Always try the nth term test first! The test says: 


if $\lim_{n\to\infty} a_n \neq 0$, or the limit doesn’t exist, then the series $\sum_{n=1}^{\infty} a_n$ diverges. 


If the terms of your series don’t tend to 0, the series must diverge. If the terms 
do tend to 0, the series might converge or it might diverge: you have to do 














Factorials involve exclamation points, such as in n! or (2n+5)!. The ratio test is also
often useful when there are exponentials around, such as $2^n$ or $(-5)^{3n}$. Here’s
the statement of the test, summarized from what we found in Section 22.5.1:
start by finding the limit

$$\lim_{n\to\infty} \left|\frac{a_{n+1}}{a_n}\right|.$$

Make sure you use a bigass fraction bar, since you may have to write a fraction
over a fraction. The nth term of the series is just $a_n$, whereas if you replace
n by (n + 1) wherever you see it, you get $a_{n+1}$ instead. Anyway, now you
have to find the above limit; let’s say you’ve done that and got an answer L.
There are three possibilities:

1. If L < 1, then the original series $\sum_{n=1}^{\infty} a_n$ converges; in fact, it converges
absolutely.

2. If L > 1, then the original series diverges.

3. If L = 1, or the limit doesn’t exist, then the ratio test is useless. Try
something else.

Now let’s look at some examples. First, consider

$$\sum_{n=1}^{\infty} \frac{n^{1000}}{2^n}.$$

It’s not a geometric series because the numerator is a polynomial. Since
exponentials grow faster than polynomials (see Section 21.3.3 in Chapter 21),
the limit of the nth term is zero:

$$\lim_{n\to\infty} \frac{n^{1000}}{2^n} = 0.$$

So we can’t use the nth term test. Since the series involves exponentials, let’s
try the ratio test. Following the standard framework, we start with

$$\lim_{n\to\infty} \left|\frac{a_{n+1}}{a_n}\right| = \lim_{n\to\infty} \left|\frac{\dfrac{(n+1)^{1000}}{2^{n+1}}}{\dfrac{n^{1000}}{2^n}}\right|.$$

Notice that the denominator is just the nth term, copied directly from the
original series. The numerator is the same as the denominator, except that we
have replaced every occurrence of n by (n + 1). Now, it’s good technique to
simplify the above expression by inverting and multiplying, grouping similar
terms together as you do so. The above expression works out to be

$$\lim_{n\to\infty} \left(\frac{n+1}{n}\right)^{1000} \frac{2^n}{2^{n+1}} = \lim_{n\to\infty} \left(\frac{n+1}{n}\right)^{1000} \cdot \frac{1}{2} = \frac{1}{2}.$$

Note that we dropped the absolute values (everything’s positive), and we also
grouped the 1000th powers together and used the fact that $\lim_{n\to\infty} (n+1)/n = 1$.
Anyway, the above limit is 1/2, which is less than 1, so the original series
converges by the ratio test. End of story.
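
A quick numerical look can make that limit feel less abstract. The sketch below is only an illustration, not a proof: it computes $|a_{n+1}/a_n|$ exactly for a few values of n, using hypothetical helper names `term` and `consecutive_ratio`, and converts to a decimal only for printing.

```python
from fractions import Fraction

def term(n):
    """nth term of the example series, n^1000 / 2^n, kept as an exact fraction."""
    return Fraction(n**1000, 2**n)

def consecutive_ratio(a, n):
    """|a(n+1) / a(n)|, the quantity whose limit the ratio test asks for."""
    return abs(a(n + 1) / a(n))

# The ratio starts out enormous and only settles toward 1/2 for large n.
for n in (10, 1_000, 100_000):
    print(n, float(consecutive_ratio(term, n)))
```

Notice how slowly the ratio approaches 1/2: for small n it is far bigger than 1, which is a reminder that only the limit matters, not the first few ratios.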

Now consider

$$\sum_{n=2}^{\infty} \frac{3^n}{n\ln(n)}.$$

You should be able to show that the terms go to $\infty$ as $n \to \infty$, so the series
diverges by the nth term test. Suppose that you just try the ratio test right
off the bat. This still works:

$$\lim_{n\to\infty} \left|\frac{a_{n+1}}{a_n}\right| = \lim_{n\to\infty} \frac{\dfrac{3^{n+1}}{(n+1)\ln(n+1)}}{\dfrac{3^n}{n\ln(n)}} = \lim_{n\to\infty} \frac{3^{n+1}}{3^n} \cdot \frac{n}{n+1} \cdot \frac{\ln(n)}{\ln(n+1)} = 3.$$

We used $\lim_{n\to\infty} n/(n+1) = 1$, which is easy, and also $\lim_{n\to\infty} \ln(n)/\ln(n+1) = 1$,
which is not. You should try using l’Hôpital’s Rule to convince yourself that
this last limit is true. Anyway, the limiting ratio above is 3, and since 3 > 1,
the original series diverges. So even though we didn’t use the nth term test,
the ratio test sufficed anyway.

The ratio test is particularly useful when dealing with factorials. Remember
that n! is the product of the numbers from 1 to n inclusive:

$$n! = 1 \times 2 \times 3 \times \cdots \times (n-1) \times n.$$

When using the ratio test with factorials, you will often have to consider ratios 
such as 

n! 


(n + 1)!. 
Now we know from above that (n + 1)!/n! simplifies down to (n + 1), so the
above quantity is

$$\lim_{n\to\infty} (n+1)\,\frac{(n+3)^n}{(n+4)^{n+1}}.$$

Now what the heck do you do? This is pretty tricky. How about writing the
denominator as $(n+4) \times (n+4)^n$ so that we match the power of n in the
numerator? Then we can group the terms like this:

$$\lim_{n\to\infty} (n+1)\,\frac{(n+3)^n}{(n+4)^{n+1}} = \lim_{n\to\infty} \frac{n+1}{n+4} \cdot \frac{(n+3)^n}{(n+4)^n}.$$

Now the plot thickens. The first factor, (n + 1)/(n + 4), clearly tends to 1 as
$n \to \infty$, but the second factor is trickier. One way to handle it is to replace
n by x and consider the limit

$$\lim_{x\to\infty} \left(\frac{x+3}{x+4}\right)^x.$$

Following the l’Hôpital’s Rule Type C method (see Section 14.1.5 in Chapter 14),
we find the limit of the logarithm (after a bit of clever algebra):

$$\lim_{x\to\infty} \ln\left(\left(\frac{x+3}{x+4}\right)^x\right) = \lim_{x\to\infty} x\ln\left(\frac{x+3}{x+4}\right) = \lim_{x\to\infty} \frac{\ln\left(\dfrac{x+3}{x+4}\right)}{1/x} = \lim_{x\to\infty} \frac{\ln(x+3) - \ln(x+4)}{1/x}.$$

The numerator goes to 0 as $x \to \infty$, since $(x+3)/(x+4) \to 1$ and ln(1) = 0.
The denominator also goes to 0, so I leave it to you to use l’Hôpital’s Rule to
show that

$$\lim_{x\to\infty} \ln\left(\left(\frac{x+3}{x+4}\right)^x\right) = -1.$$
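
In case you get stuck on that last step, here is one way it might go. This is just a sketch of the l’Hôpital’s Rule computation for the 0/0 form $(\ln(x+3) - \ln(x+4))/(1/x)$: differentiate top and bottom separately, then tidy up.

```latex
\lim_{x\to\infty} \frac{\ln(x+3) - \ln(x+4)}{1/x}
  = \lim_{x\to\infty} \frac{\dfrac{1}{x+3} - \dfrac{1}{x+4}}{-1/x^2}
  = \lim_{x\to\infty} \frac{\dfrac{1}{(x+3)(x+4)}}{-1/x^2}
  = \lim_{x\to\infty} \frac{-x^2}{(x+3)(x+4)}
  = -1.
```

The last limit is a ratio of two quadratics with leading coefficients $-1$ and 1, which is why it comes out to $-1$.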

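Exponentiating and changing x back into n, we have shown that

$$\lim_{n\to\infty} \left(\frac{n+3}{n+4}\right)^n = e^{-1}.$$

So, we now have all the pieces of the puzzle at our disposal. The limiting
ratio above works out to be

$$\lim_{n\to\infty} \left|\frac{a_{n+1}}{a_n}\right| = \lim_{n\to\infty} \frac{n+1}{n+4} \cdot \left(\frac{n+3}{n+4}\right)^n = 1 \times e^{-1} = \frac{1}{e}.$$

Since this limit is less than 1, the original series converges.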
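
If you don’t quite trust a limit involving e, you can always sanity-check it numerically. The sketch below simply plugs large values of n into the grouped ratio and compares with 1/e; the function name `limiting_ratio` is invented for this illustration, and ordinary floating-point arithmetic is assumed to be accurate enough here.

```python
import math

def limiting_ratio(n):
    """(n+1)/(n+4) times ((n+3)/(n+4))^n, the grouped ratio from the text."""
    return (n + 1) / (n + 4) * ((n + 3) / (n + 4)) ** n

# The printed values should creep toward 1/e as n grows.
for n in (10, 1_000, 1_000_000):
    print(n, limiting_ratio(n))
print("1/e =", math.exp(-1))
```

A numerical check like this never replaces the argument above, but it’s a good way to catch algebra slips.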
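How about

$$\sum_{n=2}^{\infty} \frac{1}{n\ln(n)}?$$

The terms certainly go to 0 as $n \to \infty$. Let’s try the ratio test:

$$\lim_{n\to\infty} \left|\frac{a_{n+1}}{a_n}\right| = \lim_{n\to\infty} \frac{\dfrac{1}{(n+1)\ln(n+1)}}{\dfrac{1}{n\ln(n)}} = \lim_{n\to\infty} \frac{n}{n+1} \cdot \frac{\ln(n)}{\ln(n+1)} = 1.$$

The limiting ratio is 1, so the ratio test is useless here.

23.5 How to Use the Integral Test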


Use the integral test when the series involves both 1/n and ln(n). 

In Section 22.5.3 of the previous chapter, we saw that if N is any positive
integer, then we can say:

if $a_n = f(n)$ for some continuous decreasing function f, then
$\sum_{n=N}^{\infty} a_n$ and $\int_N^{\infty} f(x)\,dx$ either both converge or both diverge.

In practice, here are the steps involved in using the integral test. 

• Replace n by x, change $\sum_{n=1}^{\infty}$ into $\int_1^{\infty}$, and put a dx at the end. Of
course, if the series begins at n = 2, then you use $\int_2^{\infty}$ instead, for
example.


• Check that the integrand is decreasing; you can do that by showing that 
the derivative is negative, or just by inspecting the integrand directly. 



• Now deal with the improper integral from the first step. The main 
advantage of integrals over series is that you can use a substitution (or 
change of variables, if you prefer) in an integral. The most common 
substitution in this context is t = ln(x).

• If the improper integral converges, so does the series. If the integral 
diverges, the series diverges too. 

For example, consider

$$\sum_{n=2}^{\infty} \frac{1}{n\ln(n)}.$$

We have already looked at this series; in fact, we tried to use the ratio test
at the end of Section 23.3 above, with no success. Let’s try the integral test
instead, which is suggested by the presence of the factor 1/n and the presence
of ln(n). Change the variable n to x, and the sum to an integral, to get

$$\int_2^{\infty} \frac{1}{x\ln(x)}\,dx$$

instead. The integrand 1/(x ln(x)) is indeed decreasing in x; you can show this
by differentiating and seeing that the derivative is negative, or more directly
by observing that x and ln(x) are both increasing in x, so their product
x ln(x) is as well, so the reciprocal 1/(x ln(x)) is decreasing in x. Anyway, we
have already looked at the above improper integral in Chapter 21, but here’s
the solution outline once again: substitute t = ln(x), so dt = (1/x) dx, and the
integral becomes

$$\int_{\ln(2)}^{\infty} \frac{1}{t}\,dt,$$

which diverges by the p-test for integrals. Since the integral diverges, so does
the original series (by the integral test).
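
It’s worth seeing just how slowly this series diverges. The sketch below compares partial sums with the integral of 1/(x ln(x)), which equals ln(ln(x)) after the substitution t = ln(x). It relies on the standard fact that for a positive decreasing integrand, the partial sum up to N is at least the integral from 2 to N + 1. The function names are made up for illustration.

```python
import math

def partial_sum(N):
    """Sum of 1/(n ln n) for n = 2, ..., N."""
    return sum(1.0 / (n * math.log(n)) for n in range(2, N + 1))

def integral_from_2(upper):
    """Integral of 1/(x ln x) from 2 to upper, which is ln(ln(upper)) - ln(ln(2))."""
    return math.log(math.log(upper)) - math.log(math.log(2))

# The integral is a lower bound for the partial sum, and both keep growing.
for N in (10, 1_000, 100_000):
    print(N, partial_sum(N), integral_from_2(N + 1))
```

Both columns grow without bound, but at the glacial pace of ln(ln(N)), so no amount of staring at partial sums would have convinced you that the series diverges.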










On the other hand, let’s modify the series slightly: consider

$$\sum_{n=2}^{\infty} \frac{1}{n(\ln(n))^2}.$$

Again, we have a factor 1/n and logarithms are involved, so try the integral
test. Replace n by x and turn the series into an integral to get

$$\int_2^{\infty} \frac{1}{x(\ln(x))^2}\,dx.$$

Try to convince yourself that the integrand is decreasing in x. Substitute
t = ln(x), and this time the integral becomes

$$\int_{\ln(2)}^{\infty} \frac{1}{t^2}\,dt,$$

which converges by the p-test. So this time the series converges (by the
integral test). Looking at this example and the previous one together, we
can really see just how subtle this whole business of convergence of series is.
We know ln(n) is pretty small compared to any positive power of n as n gets
large, but the above examples together demonstrate that a log can make a
difference. One extra measly power of ln(n) thrown into the denominator of
$\sum_{n=2}^{\infty} 1/(n\ln(n))$ turns it from a divergent series into a convergent series. (We
looked at a similar example in Section 21.3.4 of Chapter 21.)
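
To see the contrast side by side, here is a sketch of both integrals after the substitution t = ln(x), worked out with a finite upper limit R (a symbol introduced here just for illustration) which is then sent to infinity.

```latex
\int_{\ln(2)}^{R} \frac{1}{t}\,dt = \ln(R) - \ln(\ln(2)) \to \infty
  \quad\text{as } R \to \infty,
\qquad
\int_{\ln(2)}^{R} \frac{1}{t^2}\,dt = \frac{1}{\ln(2)} - \frac{1}{R} \to \frac{1}{\ln(2)}
  \quad\text{as } R \to \infty.
```

The first integral blows up while the second settles down to a finite number, which matches the divergent and convergent verdicts for the two series.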



23.6 How to Use the Comparison Test, the Limit Comparison Test, and the p-test

Use these tests for series with positive terms when none of the 
other tests seem to apply. You definitely want to try the nth term test 
first, then use the ratio test if factorials are involved, the root test if the 
terms have exponentials where the base and exponent are both functions of 
n, or the integral test if you have a factor of 1/n and logarithms are involved. 
What does that leave? Basically the same tools as you have for integrals: the
comparison test, the limit comparison test, the p-test, and an understanding of
how common functions behave near $\infty$ and near 0. You really need to review
Chapter 21 before looking at this section, since the techniques are almost
identical. In any case, here are the tests once more. (For the comparison and
limit comparison tests, we assume all the terms $a_n$ are nonnegative.)

1. Comparison test, divergence version: if you think $\sum_{n=1}^{\infty} a_n$ diverges,
find a smaller series which also diverges. That is, find a positive
sequence $\{b_n\}$ such that $a_n \ge b_n$ for all n, and such that $\sum_{n=1}^{\infty} b_n$
diverges. Then

$$\sum_{n=1}^{\infty} a_n \ge \sum_{n=1}^{\infty} b_n = \infty,$$

so $\sum_{n=1}^{\infty} a_n$ diverges.

2. The comparison test, convergence version: if you think $\sum_{n=1}^{\infty} a_n$
converges, find a larger series which also converges. That is, find $\{b_n\}$
such that $a_n \le b_n$ for all n, and such that $\sum_{n=1}^{\infty} b_n$ converges. Then

$$\sum_{n=1}^{\infty} a_n \le \sum_{n=1}^{\infty} b_n < \infty,$$

so $\sum_{n=1}^{\infty} a_n$ converges.

3. Limit comparison test: find a simpler series $\sum_{n=1}^{\infty} b_n$ so that $a_n \sim b_n$
as $n \to \infty$. Then if $\sum_{n=1}^{\infty} b_n$ converges, so does $\sum_{n=1}^{\infty} a_n$. On the other
hand, if $\sum_{n=1}^{\infty} b_n$ diverges, then so does $\sum_{n=1}^{\infty} a_n$. (Remember that
“$a_n \sim b_n$ as $n \to \infty$” means the same thing as “$\lim_{n\to\infty} a_n/b_n = 1$.”)

4. p-test: if $a \ge 1$, the series

$$\sum_{n=a}^{\infty} \frac{1}{n^p} \quad \begin{cases} \text{converges if } p > 1, \\ \text{diverges if } p \le 1. \end{cases}$$

This is the same as the $\int^{\infty}$ version of the p-test for integrals.

Now let’s look at some examples. In each example below, you could replace
the sum by an integral and get an improper integral (with problem spot at
$\infty$) instead of a series. The solutions to the improper integral problems are
identical to the corresponding solutions for the series below. In each case, you
should try to write down the equivalent problem and solution for the improper
integral version. It’s also a good idea to look back at Chapter 21 and try to
convert every improper integral with problem spot at $\infty$ to a series. Almost all
of them can be solved using the above tests. (The exceptions are the problems
whose solutions involve the change of variables t = ln(x); for those problems,
you’d need to use the integral test in order to solve the corresponding series
problems.) Anyway, consider the series

$$\sum_{n=1}^{\infty} \frac{2n^2 + 3n + 7}{n^4 + 2n^3 + 1}.$$

To examine this, note that the highest term in each polynomial dominates,
since n is getting larger and larger. (See Section 21.3.1 of Chapter 21 for more
details.) So we have

$$\frac{2n^2 + 3n + 7}{n^4 + 2n^3 + 1} \sim \frac{2n^2}{n^4} = \frac{2}{n^2} \quad\text{as } n \to \infty.$$

By the p-test, $\sum_{n=1}^{\infty} 2/n^2$ converges (the constant 2 is irrelevant); so by the
limit comparison test, the original series above converges as well.
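
Since “asymptotic” just means that the ratio of the two quantities tends to 1, you can watch it happen. Here is a small sketch using exact fractions; the names `a` and `b` stand for the terms of the original series and the simpler comparison series $2/n^2$, and are chosen here purely for illustration.

```python
from fractions import Fraction

def a(n):
    """Term of the original series: (2n^2 + 3n + 7) / (n^4 + 2n^3 + 1)."""
    return Fraction(2 * n**2 + 3 * n + 7, n**4 + 2 * n**3 + 1)

def b(n):
    """Term of the simpler comparison series: 2 / n^2."""
    return Fraction(2, n**2)

# The ratio a(n)/b(n) should approach 1 as n grows.
for n in (10, 1_000, 1_000_000):
    print(n, float(a(n) / b(n)))
```

The ratio gets closer and closer to 1, which is exactly the hypothesis the limit comparison test needs.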

A slight technicality arises in the almost identical example

$$\sum_{n=0}^{\infty} \frac{2n^2 + 3n + 7}{n^4 + 2n^3 + 1}.$$

The only difference between this and the previous example is that the sum
now begins at n = 0. If you use the same argument as in the previous
example, you may find that you are comparing the above series with the
series $\sum_{n=0}^{\infty} 2/n^2$. This last series isn’t well-defined, since its first term looks
like 2/0, which really sucks. You can avoid this issue by one of two methods:
you could just say that you’re changing the first term n = 0 into something
else, like n = 1, and this doesn’t affect the convergence. Alternatively, you
could break off the term n = 0 from the sum. Indeed, when n = 0, the
quantity $(2n^2 + 3n + 7)/(n^4 + 2n^3 + 1)$ is just 7, so

$$\sum_{n=0}^{\infty} \frac{2n^2 + 3n + 7}{n^4 + 2n^3 + 1} = 7 + \sum_{n=1}^{\infty} \frac{2n^2 + 3n + 7}{n^4 + 2n^3 + 1}.$$

By our standard ideas about higher powers dominating, we have 



we have picked the exponent 1002 since it’s 2 bigger than the exponent 1000 
in the question. We now have 


$$\sum_{n=1}^{\infty} 2^{-n} n^{1000} \le \sum_{n=1}^{\infty} \frac{C}{n^{1002}}\, n^{1000} = C \sum_{n=1}^{\infty} \frac{1}{n^2} < \infty,$$

where the last series converges by the p-test. So the original series converges 
This is just the series version of an example on page 465. In fact, you could
use the integral test to convert this series problem into the improper integral
problem there, since the integrand is decreasing, but what would be the
point? We might as well just solve it directly. As we did in the improper
integral example, we use $\ln(n) \le Cn^{0.0005}$, where we have cunningly chosen
the exponent 0.0005 so small that you can take it away from the exponent
1.001 (which arises in our series) and still be greater than 1. So we have

$$\sum_{n=1}^{\infty} \frac{\ln(n)}{n^{1.001}} \le \sum_{n=1}^{\infty} \frac{Cn^{0.0005}}{n^{1.001}} = C \sum_{n=1}^{\infty} \frac{1}{n^{1.0005}} < \infty,$$

where the last series converges by the p-test. So our original series converges
by the comparison test.

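The series

$$\sum_{n=1}^{\infty} \frac{|\sin(n)|}{n^2}$$

is pretty easy to deal with. Remembering that $|\sin(n)| \le 1$, we see that

$$\sum_{n=1}^{\infty} \frac{|\sin(n)|}{n^2} \le \sum_{n=1}^{\infty} \frac{1}{n^2} < \infty.$$

So the series converges by the comparison test.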
Now consider the series

$$\sum_{n=1}^{\infty} \sin\left(\frac{1}{n}\right).$$

It may look like some of the terms of the series might be negative, but that’s
a load of bull. Indeed, when n starts at 1 and works its way up the positive
integers, the numbers 1/n start at 1 and work their way down toward 0. That
is, 1/n is always between 0 and 1. Since sin(x) is positive when x is between
0 and 1, the series has all positive terms! So what? We still haven’t done
the problem. How do we proceed? In Section 21.4.2 of Chapter 21, we saw
that $\sin(x) \sim x$ as $x \to 0$. Replacing x by 1/n, we see that $\sin(1/n) \sim 1/n$
as $1/n \to 0$. Wait a second: when $1/n \to 0$, we must have $n \to \infty$. That
is, we have shown that $\sin(1/n) \sim 1/n$ as $n \to \infty$. This is exactly what we
need! Since the series $\sum_{n=1}^{\infty} 1/n$ diverges, the limit comparison test shows
that our original series above diverges too. (Compare this example with the
last example in Section 21.4.2 of Chapter 21.)
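
Here is a tiny numerical illustration of the relation $\sin(1/n) \sim 1/n$. It only shows the ratio heading to 1; the function name `sine_ratio` is invented for this sketch.

```python
import math

def sine_ratio(n):
    """sin(1/n) divided by 1/n; this should approach 1 as n grows."""
    return math.sin(1 / n) / (1 / n)

for n in (1, 10, 1_000, 1_000_000):
    print(n, sine_ratio(n))
```

Every value is positive and the ratio climbs toward 1, in line with the claim that the terms behave like 1/n for large n.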

On the other hand, the series

$$\sum_{n=1}^{\infty} \sin^2\left(\frac{1}{n}\right)$$

converges, since $\sin^2(1/n) \sim 1/n^2$ as $n \to \infty$; you get to fill in the details.
Finally, a really nasty series:

$$\sum_{n=2}^{\infty} \cos^2(n) \tan\left(\frac{(n^2 + 4n - 3)\ln(n)}{\sqrt{n^7 + 2n^4 + 3n}}\right).$$

How do you approach this? Consider the pieces of this series. As $n \to \infty$,
the $(n^2 + 4n - 3)$ factor is asymptotic to $n^2$, and the $\sqrt{n^7 + 2n^4 + 3n}$ factor
is asymptotic to $\sqrt{n^7}$, which is just $n^{7/2}$. So we can say that

$$\frac{(n^2 + 4n - 3)\ln(n)}{\sqrt{n^7 + 2n^4 + 3n}} \sim \frac{n^2\ln(n)}{n^{7/2}} = \frac{\ln(n)}{n^{3/2}} \quad\text{as } n \to \infty.$$

On the other hand, both sides of this above relation go to 0 as n gets large
(remember, logs grow slowly!). So we can use the relation $\tan(x) \sim x$ as $x \to 0$,
with x replaced by the horrible quantity $(n^2 + 4n - 3)\ln(n)/\sqrt{n^7 + 2n^4 + 3n}$,
to get

$$\tan\left(\frac{(n^2 + 4n - 3)\ln(n)}{\sqrt{n^7 + 2n^4 + 3n}}\right) \sim \frac{(n^2 + 4n - 3)\ln(n)}{\sqrt{n^7 + 2n^4 + 3n}} \sim \frac{\ln(n)}{n^{3/2}} \quad\text{as } n \to \infty.$$

Now let’s concentrate for a moment on the series

$$\sum_{n=2}^{\infty} \frac{\ln(n)}{n^{3/2}}.$$

We need to use the fact that logs grow slowly to make the ln(n) insignificant
compared to the $n^{3/2}$ term (see Section 21.3.4 of Chapter 21 for more details).
Specifically, we like the power 3/2 in the denominator and don’t want this to
be 1 or smaller. So let’s use $\ln(n) \le Cn^{1/4}$ (the power here just needs to be
less than 1/2) and see that

$$\frac{\ln(n)}{n^{3/2}} \le \frac{Cn^{1/4}}{n^{3/2}} = \frac{C}{n^{5/4}}.$$

So, summing everything up, we see that

$$\sum_{n=2}^{\infty} \frac{\ln(n)}{n^{3/2}}$$

converges by the comparison test. Now look back at the asymptotic relation
way back above; the limit comparison test now implies that

$$\sum_{n=2}^{\infty} \tan\left(\frac{(n^2 + 4n - 3)\ln(n)}{\sqrt{n^7 + 2n^4 + 3n}}\right)$$

also converges. Great, we’re nearly done. How about that $\cos^2(n)$ factor?
That’s not too helpful, since it keeps oscillating. We do know that it’s less
than or equal to 1, and that it’s positive (since it’s a square). So we’ll just
use $\cos^2(n) \le 1$ and see what we get. In fact,

$$\sum_{n=2}^{\infty} \cos^2(n) \tan\left(\frac{(n^2 + 4n - 3)\ln(n)}{\sqrt{n^7 + 2n^4 + 3n}}\right) \le \sum_{n=2}^{\infty} \tan\left(\frac{(n^2 + 4n - 3)\ln(n)}{\sqrt{n^7 + 2n^4 + 3n}}\right) < \infty,$$

as we have just shown that the right-hand series converges. So, our original
series converges by the comparison test. How about that: we used the
comparison test twice, the limit comparison test once, and the p-test twice.
Tricky stuff; but if you can do that sort of problem on your own, then you
should be able to do just about any problem involving these three tests.


23.7 How to Deal with Series with Negative Terms

Suppose some of the numbers $a_n$ which appear as terms in your series are
negative. Here are some ways to handle this situation:

1. If all the terms $a_n$ are negative, then modify the series by putting
a minus sign in front of all the terms. The modified series is $\sum_{n=1}^{\infty} (-a_n)$,
which has all positive terms. Then you can use the techniques above to work
out whether the modified series converges or diverges. Then if the modified
series diverges, so does the original one we are interested in, whereas if the
modified series converges, then the original one also converges. In fact, if the
modified series converges to L, then the original one converges to $-L$, since
the modified series is just the negative of the original series. For example,
does the series

$$\sum_{n=3}^{\infty} \frac{\ln(1/n)}{\sqrt{n}}$$

converge or diverge? Well, 1/n is near 0 when n is large, so taking log of
it will give a negative number. (Remember, ln(x) < 0 if 0 < x < 1.) It’s
therefore easier to consider the modified series

$$\sum_{n=3}^{\infty} \frac{-\ln(1/n)}{\sqrt{n}},$$

which is actually the same thing as

$$\sum_{n=3}^{\infty} \frac{\ln(n)}{\sqrt{n}},$$

since $-\ln(1/n) = -(\ln(1) - \ln(n)) = \ln(n)$. Now, what’s our intuition about
this? If this series were just

$$\sum_{n=3}^{\infty} \frac{1}{\sqrt{n}},$$

it would diverge by the p-test. Normally logs don’t make much difference, but
this isn’t always true; remember the examples from the integral test above.
In any case, this particular log actually helps the series to diverge, since it
blows up as $n \to \infty$. The basic logic is that as n ranges from 3 upward, the
least that ln(n) can be is ln(3), so we have

$$\ln(n) \ge \ln(3)$$

for any $n \ge 3$. In our series, it follows that
• the absolute values of the terms $|a_n|$ are decreasing in n (so the underlying
sequence is getting smaller and smaller, in terms of absolute
value).


If all three of these properties are true, then the series converges. Note: you
should always try the absolute convergence test first. If the series
converges absolutely, do not use the alternating series test! Also,
notice that the second property is just the nth term test in disguise; this
follows from the fact that $\lim_{n\to\infty} |a_n| = 0$ if and only if $\lim_{n\to\infty} a_n = 0$. So even if
you forget to try the nth term test first, you have to do it anyway as part of
the alternating series test.

Here’s a classic example:

$$\sum_{n=1}^{\infty} \frac{(-1)^n}{n}.$$

The absolute version is $\sum_{n=1}^{\infty} 1/n$, which diverges by the p-test; so our series
does not converge absolutely. Let’s dive straight in to the alternating series
test. We need to check the three properties. First, is the series alternating?
Yes. A series is automatically alternating if it has terms which look like
$(-1)^n$ or $(-1)^{n+1}$ multiplied by a positive number. In this case, the nth
term is $(-1)^n$ multiplied by the positive number 1/n. How about the second
property? We need to show that

$$\lim_{n\to\infty} \left|\frac{(-1)^n}{n}\right| = 0.$$

This is obviously true, since $|(-1)^n/n| = 1/n$. As for the third property,
we need to show that $\{|(-1)^n/n|\}$ is a decreasing sequence. This is pretty
straightforward, again since $|(-1)^n/n| = 1/n$ and we know that 1/n is decreasing
in n. So the alternating series test applies and shows that the original series

$$\sum_{n=1}^{\infty} \frac{(-1)^n}{n}$$

converges. Since we’ve already checked that it doesn’t converge absolutely,
we know that it converges conditionally.
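
You can see conditional convergence in action with a few partial sums. The sketch below compares the alternating series with its absolute version. Two facts used here are not from the discussion above, though both are standard: the alternating series starting at n = 1 has sum $-\ln(2)$, and the error after N terms of an alternating series of this type is at most the size of the next term. The function names are made up for illustration.

```python
import math

def alternating_partial_sum(N):
    """Sum of (-1)^n / n for n = 1, ..., N."""
    return sum((-1) ** n / n for n in range(1, N + 1))

def harmonic_partial_sum(N):
    """Sum of 1/n for n = 1, ..., N (the absolute version of the series)."""
    return sum(1 / n for n in range(1, N + 1))

# The alternating sums settle near -ln(2); the absolute sums keep growing.
for N in (10, 1_000, 100_000):
    print(N, alternating_partial_sum(N), harmonic_partial_sum(N))
print("-ln(2) =", -math.log(2))
```

The first column homes in on a fixed number while the second keeps climbing, which is the whole story of conditional convergence in two columns.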

On the other hand, consider the series

$$\sum_{n=1}^{\infty} \frac{(-1)^n}{n^2}.$$

The absolute version is $\sum_{n=1}^{\infty} 1/n^2$, which converges by the p-test. So the
above series converges absolutely, and there’s no need to waste time with the
alternating series test.

Let’s see a couple more examples. First, let’s look at

$$\sum_{n=1}^{\infty} (-1)^n \sin\left(\frac{1}{n}\right).$$

This is very similar to an example we looked at on page 513 above, except
that we now have an extra factor of $(-1)^n$. The first thing to do with our















CHAPTER 24 


Introduction to Taylor Polynomials, Taylor Series, 
and Power Series 


We now come to the important topics of power series and Taylor polynomials 
and series. In this chapter, we’ll see a general overview of these topics. The 
following two chapters will deal with problem-solving techniques in the context 
of the material in this chapter. Here’s what we’ll look at first: 

• approximations, Taylor polynomials, and a Taylor approximation theorem;

• how good our approximations are, and the full Taylor Theorem;

• the definition of power series;

• the definition of Taylor series and Maclaurin series; and 

• convergence issues involving Taylor series. 


24.1 Approximations and Taylor Polynomials 


Here’s a nice fact: for any real number x, we have

$$e^x \approx 1 + x + \frac{x^2}{2} + \frac{x^3}{6}.$$

Also, the closer x is to 0, the better the approximation.

Let’s play around with this for a little bit. Start off with x = 0. Actually,
both sides are then equal to 1, so the approximation is perfect! What about
when x isn’t 0? Let’s try $x = -1/10$. The above equation says that

$$e^{-1/10} \approx 1 - \frac{1}{10} + \frac{1/100}{2} - \frac{1/1000}{6},$$

which simplifies to

$$e^{-1/10} \approx \frac{5429}{6000}.$$

My calculator says that $e^{-1/10}$ is equal to 0.9048374180 (to ten decimal
places), while 5429/6000 is equal to 0.9048333333 (also to ten decimal places).
These numbers are pretty close to each other! In fact, the difference is only
about 0.0000040847.
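
If you’d like to reproduce these numbers yourself, here is a short sketch. It evaluates the cubic exactly using fractions and then compares with Python’s built-in exponential, which of course is itself only an approximation; the function name `cubic_approx` is made up for illustration.

```python
import math
from fractions import Fraction

def cubic_approx(x):
    """1 + x + x^2/2 + x^3/6, the cubic approximation to e^x used in the text."""
    return 1 + x + x**2 / 2 + x**3 / 6

approx = cubic_approx(Fraction(-1, 10))
print(approx)                          # 5429/6000
print(float(approx))                   # the approximation as a decimal
print(math.exp(-0.1) - float(approx))  # the small gap between the two
```

Using `Fraction` for x keeps the first answer exact, so you get 5429/6000 itself rather than a rounded decimal.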

How on earth did I come up with the polynomial $1 + x + x^2/2 + x^3/6$?
It’s clearly not just any old polynomial; it’s specially related to $e^x$. Rather
than concentrate on $e^x$ itself, let’s get a little more general and consider other
functions. Also, there’s nothing special about the degree 3 of our polynomial:
we could have used any degree. So let’s start with degree 1 and see what
happens.

24.1.1 Linearization revisited

Let’s say we have some function f which is very smooth, so that it can be
repeatedly differentiated as many times as you like without causing any problems.
Here’s a question we asked back in Section 13.2 of Chapter 13: what is
the equation of the line which best approximates the curve y = f(x) near the
point (a, f(a))? The answer to this question is that the line we’re looking for
is the tangent line to the curve at the point (a, f(a)), and its equation is

$$y = f(a) + f'(a)(x - a).$$

This is precisely the linearization of f at x = a. The right-hand side is a
polynomial of degree 1. In the picture below, the tangent line to the curve
y = f(x) at x = a is drawn in, and looks like a pretty lousy approximation to
the whole curve:



Nevertheless, it is a pretty good approximation to the curve near (a, f(a)). In
fact, let’s zoom in near (a, f(a)):

Now you can see that there’s not that much difference between the tangent
line and the curve y = f(x). The more we zoom in near x = a, the smaller
this difference becomes.

24.1.2 Quadratic approximations 

Why stick to lines, though? Let’s ask the same question we did at the beginning
of the previous section, but with parabolas instead. Here is our question:
what is the equation of the quadratic which best approximates the curve
y = f(x) near (a, f(a))? Using the same function as in the picture above,
here’s a guess as to what the quadratic should look like:



It turns out that the formula for the quadratic which best approximates the
curve y = f(x) for x near a (that is, near the point (a, f(a)) on the curve) is
given by

$$y = f(a) + f'(a)(x - a) + \frac{f''(a)}{2}(x - a)^2.$$

This is actually a quadratic in x, because if you expand $(x - a)^2$, the highest
power of x floating around is $x^2$. Still, it’s better to leave it in the above form
and say that it’s a “quadratic in (x − a).” Let’s call the quadratic $P_2$; that is,
we set

$$P_2(x) = f(a) + f'(a)(x - a) + \frac{f''(a)}{2}(x - a)^2.$$

Now, let’s gather some nice facts about $P_2$:

1. Plug x = a into the above equation for $P_2(x)$ and you easily see that
$P_2(a) = f(a)$. So the values of $P_2$ and f match when x = a. In fact,
since the zeroth derivative of a function is just the function itself, you
might say that the values of the zeroth derivatives of $P_2$ and f match
when x = a.

2. Now differentiate $P_2$ to see that $P_2'(x) = f'(a) + f''(a)(x - a)$. Again, if
you plug in x = a, you see that $P_2'(a) = f'(a)$. The values of the first
derivatives $P_2'$ and $f'$ also match when x = a.

3. Differentiate once more to get $P_2''(x) = f''(a)$. When x = a, this becomes
$P_2''(a) = f''(a)$, so even the values of the second derivatives match when
x = a.

4. On the other hand, since $f''(a)$ is constant, $P_2'''(x) = 0$ for all x. The
same is true for all higher derivatives. (After all, $P_2$ is a quadratic,
and the third and higher derivatives of any quadratic must be zero
everywhere!)

So $P_2$ shares the zeroth, first, and second derivatives with f at x = a; but the
third and higher derivatives of $P_2$ are always 0. You might say that $P_2$ is the
distillation of all the information about f at x = a up to and including the
second derivative.

Here’s another nice fact about $P_2$: if you ignore the last term on the right-hand
side of the above equation for $P_2(x)$, you just get $f(a) + f'(a)(x - a)$.
This is exactly the linearization from the previous section. So you can think of
the last term $\frac{1}{2}f''(a)(x - a)^2$ as a so-called second-order correction term. This
means that we should actually be able to do a better job of approximation
than just by using the tangent line. The second-order correction term helps
us get even closer to the curve, at least for x near a. (An exception to this
occurs when $f''(a) = 0$, in which case $P_2$ is actually the linearization and we
haven’t gotten any closer.)

24.1.3 Higher-degree approximations

Let’s continue the same pattern, except that we’ll use some arbitrary degree
N instead of just 1 or 2. So, here’s our question: which polynomial of degree
N or less gives the best approximation to f(x) for values of x near a? The
answer is provided by the following theorem.

Taylor approximation theorem: if f is smooth at x = a, then of all the
polynomials of degree N or less, the one which best approximates f(x) for x
near a is

$$P_N(x) = f(a) + f'(a)(x - a) + \frac{f''(a)}{2!}(x - a)^2 + \frac{f^{(3)}(a)}{3!}(x - a)^3 + \cdots + \frac{f^{(N)}(a)}{N!}(x - a)^N.$$

In sigma notation, the formula looks like this:

$$P_N(x) = \sum_{n=0}^{N} \frac{f^{(n)}(a)}{n!}(x - a)^n.$$

In this formula, remember that 0! = 1, that $f^{(0)}(a)$ means the same thing as
f(a) (zero derivatives), and that $f^{(1)}(a)$ means the same thing as $f'(a)$ (one
derivative).

We call the polynomial $P_N$ the Nth-order Taylor polynomial of f(x) at
x = a. Note that the degree of $P_N$ might be less than N; for example, if
$f^{(N)}(a) = 0$, then the last term in the above sum vanishes and the degree of
$P_N$ could be at most N − 1. This is why we call it an Nth-order Taylor polynomial,
not an Nth-degree Taylor polynomial. (By the way, the polynomial
$P_N(x)$ is sometimes written as $P_N(x; a)$ to emphasize that you get a different
polynomial for each choice of N and a. I’ll just write $P_N(x)$, since we’re only
dealing with one choice of a at a time anyway.)
Once again, the important property of $P_N$ is that

$$P_N^{(n)}(a) = f^{(n)}(a)$$

for all n = 0, 1, ..., N. That is, the values of all the derivatives of $P_N$ and
f match when x = a, up to and including the Nth derivative; but all higher
derivatives of $P_N$ must be zero everywhere. The function $P_N$ is the distillation
of all the information about f which comes from its derivatives up to order
N at x = a.

Of course, when N = 1, we just get $P_1(x) = f(a) + f'(a)(x - a)$, which
is the linearization of f at x = a, and when N = 2, we recover the formula
for $P_2(x)$ from the previous section. Let’s see how this works for $f(x) = e^x$,
setting a = 0. By the above formula with N = 3 and a = 0, we have

$$P_3(x) = f(0) + f'(0)x + \frac{f''(0)}{2!}x^2 + \frac{f^{(3)}(0)}{3!}x^3.$$

Luckily, all the derivatives of $e^x$ with respect to x are just $e^x$, so we can see
that f(0), $f'(0)$, $f''(0)$, and $f^{(3)}(0)$ are all $e^0$, which is equal to 1. Since 2! = 2
and 3! = 6, the above formula becomes

$$P_3(x) = 1 + x + \frac{1}{2}x^2 + \frac{1}{6}x^3.$$

This is exactly the cubic polynomial from the beginning of Section 24.1 above!
Of all degree three or lower polynomials, this one gives the best approximation
for $e^x$ when x is near 0. Why 0? Because that’s the value we chose for a. If
we chose a different value of a, we’d get a different polynomial which would
approximate $e^x$ really well for x near a. By getting rid of the cubic term
$x^3/6$, we can see that $P_2(x) = 1 + x + x^2/2$; then by throwing away the
quadratic term $x^2/2$, we get the linearization $P_1(x) = 1 + x$. Another way to
look at this is that $P_2(x)$ improves upon $P_1(x)$ by adding on the second-order
correction term $x^2/2$, while $P_3(x)$ improves upon $P_2(x)$ by adding on a third-order
correction term $x^3/6$. Each time you increase N by 1, you are making
the approximation better by adding on another correction term.
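
The sigma-notation formula translates almost word for word into code. The sketch below is one possible way to do it: you hand over the list of derivative values f(a), f'(a), f''(a), and so on, and it adds up the terms. The function name `taylor_polynomial` and the list-of-derivatives interface are choices made here for illustration, not anything standard.

```python
from fractions import Fraction
from math import factorial

def taylor_polynomial(derivatives_at_a, a, x):
    """Evaluate the sum of f^(n)(a)/n! * (x - a)^n over n = 0, ..., N,
    where derivatives_at_a[n] holds the value of f^(n)(a)."""
    return sum(Fraction(d) / factorial(n) * (x - a) ** n
               for n, d in enumerate(derivatives_at_a))

# For f(x) = e^x and a = 0, every derivative at 0 equals 1.
x = Fraction(-1, 10)
for N in (1, 2, 3):
    value = taylor_polynomial([1] * (N + 1), 0, x)
    print(N, value, float(value))
```

Each extra correction term nudges the value a little closer to $e^{-1/10}$, which is roughly 0.9048374.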

The Taylor approximation theorem actually depends on Taylor’s Theorem, 
which we’ll look at in the next section. There’s something ambiguous about 
the statement of the approximation theorem, as well: what on earth does it 
mean to be the “best approximation,” anyway? We’ll explore this more in the 
next section, but the real answer is contained in Section A.7 of Appendix A, 
along with the proof of the theorem. 


24.1.4 Taylor’s Theorem

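In Section 24.1 above, we saw that

$$e^x \approx 1 + x + \frac{x^2}{2} + \frac{x^3}{6}.$$

In particular, we noticed that when $x = -1/10$, the above approximation
becomes

$$e^{-1/10} \approx 1 - \frac{1}{10} + \frac{1/100}{2} - \frac{1/1000}{6} = \frac{5429}{6000}.$$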
How good is this approximation? One way to measure this is to consider the
difference between the true quantity $e^{-1/10}$ and the approximation 5429/6000.
We’ll call this quantity the error in the approximation, since it shows how
wrong we are to use our approximation instead of the true value. Here’s what
the error is in our case:

$$\text{error} = \text{true value} - \text{approximate value} = e^{-1/10} - \frac{5429}{6000}.$$

If the error is small, then the approximation is good. In Section 24.1, we saw
that the difference was 0.0000040847 to 10 decimal places, but we needed
to use a calculator, which defeats the entire purpose of doing our own approximation.
Remember that the number the calculator gives you is also an
approximation! Besides, how do you think the calculator works? It probably
finds its approximation of $e^{-1/10}$ using a Taylor polynomial.

What we’d really like is another formula for the error. That’s where Taylor’s
theorem comes in. Rather than just specializing to the case of $e^x$, let’s
get more general once again. We’re dealing with a smooth function f and its
Nth-order Taylor polynomial about x = a; as we saw in the previous section,
this polynomial is given by

$$P_N(x) = \sum_{n=0}^{N} \frac{f^{(n)}(a)}{n!}(x - a)^n.$$

We want to use the value of $P_N(x)$ to approximate the true value f(x), so we
consider the error term, which is the difference between the true value and
the approximate value:

$$R_N(x) = f(x) - P_N(x).$$

Actually, $R_N(x)$ is called the Nth-order error term; it’s also referred to as the
Nth-order remainder term, since it’s all that remains when you take $P_N(x)$
away from f(x). As promised above, Taylor’s Theorem gives an alternative
formula for $R_N(x)$:


Taylor’s Theorem: the Nth-order remainder term $R_N(x)$ about x = a is

$$R_N(x) = \frac{f^{(N+1)}(c)}{(N+1)!}(x - a)^{N+1},$$

where c is some number which lies between x and a.


Note that the number c depends on what x and N are, and cannot be determined
in general! Since $f(x) = P_N(x) + R_N(x)$, we can write the whole kit
and caboodle as

$$f(x) = \sum_{n=0}^{N} \frac{f^{(n)}(a)}{n!}(x - a)^n + \frac{f^{(N+1)}(c)}{(N+1)!}(x - a)^{N+1}.$$

This seems pretty nasty. And what on earth is with this number c, anyway?
Actually, we’ve seen something like this before. Take a look back at our
discussion of the Mean Value Theorem (MVT) in Section 11.3 of Chapter 11.
The MVT says that if $f$ is smooth enough on an interval [a, b], then there is
a number c in [a, b] (which cannot be determined in general) such that

$$f'(c) = \frac{f(b) - f(a)}{b - a}.$$


If you replace b by x and solve for $f(x)$, you get

$$f(x) = f(a) + f'(c)(x-a),$$

where c is between a and x. Now let’s look back to the last equation in
Taylor’s Theorem and put N = 0. What is $P_0(x)$? It’s just $f(a)$. How about
$R_0(x)$? According to Taylor’s Theorem,

$$R_0(x) = \frac{f^{(1)}(c)}{1!}(x-a)^1 = f'(c)(x-a),$$

where c is between x and a. Hey, so Taylor’s Theorem (with N = 0) says that

$$f(x) = P_0(x) + R_0(x) = f(a) + f'(c)(x-a),$$

which is exactly what the MVT says! So, Taylor’s Theorem is basically the
Mean Value Theorem on steroids. By the way, the reason we say that c is
between x and a instead of writing $a < c < x$ is that x might actually be less
than a, so then we would have $x < c < a$.

Now let’s put N = 1 instead of N = 0. The main formula in the box above
becomes

$$f(x) = f(a) + f'(a)(x-a) + \frac{1}{2}f''(c)(x-a)^2 = L(x) + R_1(x);$$

here $L(x) = f(a) + f'(a)(x-a)$ is the linearization of $f$ about $x = a$, and
$R_1(x) = \frac{1}{2}f''(c)(x-a)^2$ is the first-order error term. This agrees with the
formula for the error term r(x) which we gave in Section 13.2.4 of Chapter 13.
We still have to go back to our approximation for $e^x$. When we wrote

$$e^x \approx 1 + x + \frac{x^2}{2} + \frac{x^3}{6},$$

we now understand that this is just saying that $e^x \approx P_3(x)$, where $P_3$ is the
third-order Taylor polynomial of $f(x) = e^x$ about $x = 0$. Taylor’s Theorem
above says that $R_3$ is then given by

$$R_3(x) = \frac{f^{(4)}(c)}{4!}x^4,$$

where c is between 0 and x. (I just plugged N = 3 and a = 0 into the formula
for $R_N(x)$ in the box above.) Since any derivative of $e^x$ (with respect to x) is
$e^x$, we know that $f^{(4)}(c) = e^c$; also 4! = 24, so we actually have

$$R_3(x) = \frac{e^c}{24}x^4.$$

In other words,

$$e^x = 1 + x + \frac{x^2}{2} + \frac{x^3}{6} + \frac{e^c}{24}x^4.$$


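We have changed our approximation into an exact equation, but we don’t
know what c is! Still, we do get something very useful from this, because we
know that c lies between 0 and x. For example, if you put x = -1/10 once
again, you get

$$e^{-1/10} = 1 - \frac{1}{10} + \frac{(-1/10)^2}{2} + \frac{(-1/10)^3}{6} + \frac{e^c}{24}\left(-\frac{1}{10}\right)^4,$$

which reduces to

$$e^{-1/10} = \frac{5429}{6000} + \frac{e^c}{240000}.$$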


This time, we know c lies between 0 and x = -1/10, so we actually have
$-1/10 < c < 0$. Since $e^c$ is increasing in c, it’s clearly biggest when c is as big
as possible; this means that c would have to be 0, and so $e^c$ can’t be bigger
than $e^0 = 1$. So the error term is at most 1/240000. In other words, when
we write $e^{-1/10} \approx 5429/6000$, we know that the approximation is accurate to
better than 1/240000, which is about 0.0000041667. (Compare this with the
actual value of the difference in Section 24.1 above.)
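
If you want to double-check the numbers above, here is a minimal Python sketch. It is only an illustration: the variable names are made up for this example, and the "actual error" it prints relies on the computer's own floating-point approximation of the exponential, so treat it as a sanity check rather than a proof.

```python
from fractions import Fraction
import math

# Third-order Maclaurin polynomial of e^x, evaluated exactly at x = -1/10.
x = Fraction(-1, 10)
p3 = sum(x**n / math.factorial(n) for n in range(4))

# Taylor's Theorem: R_3(x) = e^c * x^4 / 4! with -1/10 < c < 0, so e^c < 1.
bound = abs(x)**4 / math.factorial(4)

# Floating-point estimate of the true error (itself only an approximation).
actual_error = math.exp(-0.1) - float(p3)

print(p3)            # exact value of the approximation
print(float(bound))  # guaranteed upper bound on the error
print(actual_error)  # should be positive and smaller than the bound
```

Exact fractions are used for the polynomial and the bound so that no rounding sneaks into the part of the calculation we could have done by hand.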

We’ll look at some examples of using Taylor’s Theorem in Section 25.3 in 
the next chapter. Now it’s time to check out power and Taylor series. 


24.2 Power Series and Taylor Series


Here’s another fact:

$$e^x = 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \frac{x^4}{4!} + \frac{x^5}{5!} + \cdots$$


for all real numbers x. You might notice that it looks similar to the
approximation at the beginning of Section 24.1 above, but there are two big differences.
First, we’re no longer dealing with an approximation, and second, there’s an
infinite series on the right-hand side! Whenever you have an infinite series,
you’ve got to be careful.

So, let’s see if we can understand what the above equation actually means.
Suppose we start with the right-hand side,

$$1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \frac{x^4}{4!} + \frac{x^5}{5!} + \cdots.$$


This looks like a polynomial, but it isn’t, since there’s no highest-degree term.
It just keeps on going forever. In fact, it’s an example of a power series. If you
replace x by any particular value, you get a regular old series. For example,
if x = -1/10, you get the series

$$1 - \frac{1}{10} + \frac{1/100}{2!} - \frac{1/1000}{3!} + \frac{1/10000}{4!} - \frac{1/100000}{5!} + \cdots,$$

which you could rewrite as

$$1 - \frac{1}{10} + \frac{1}{100 \times 2!} - \frac{1}{1000 \times 3!} + \frac{1}{10000 \times 4!} - \frac{1}{100000 \times 5!} + \cdots.$$

I could give you a million more examples; actually, infinitely more. This
single power series gives us information about infinitely many regular series,
one for each value of x. By the way, it’s pretty obvious that the last series
above converges to 1. There’s something special about setting x = 0: it makes
all the terms vanish except for the constant term. We’ll address this point
soon; first, let’s look at general power series.

24.2.1 Power series in general

A power series about x = 0 is an expression of the form

$$a_0 + a_1 x + a_2 x^2 + a_3 x^3 + a_4 x^4 + \cdots,$$

where the numbers $a_n$ are fixed constants. Even though a power series isn’t a
polynomial, we’ll still refer to $a_n$ as the coefficient of $x^n$ in the power series.
The above series can also be written using sigma notation as

$$\sum_{n=0}^{\infty} a_n x^n.$$

In our example from the previous section, the series is

$$1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \frac{x^4}{4!} + \frac{x^5}{5!} + \cdots,$$

which can be written in sigma notation as

$$\sum_{n=0}^{\infty} \frac{1}{n!} x^n.$$

This series might converge, or it might diverge. So which is it? The answer is
that it converges, and what’s more, we even know that it converges to $e^{-1/10}$.
That’s the power of knowing that our above equation is valid for any real x:

$$e^x = 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \frac{x^4}{4!} + \frac{x^5}{5!} + \cdots.$$

It just means that if you plug any particular x into the right-hand side, you
get a series which converges to the number $e^x$. We’ll prove that this is actually
true in Section 24.2.3 below; in the meantime, here are some more examples
of what happens when you plug in a few values of x, one at a time:



So this is a power series with coefficients given by $a_n = 1/n!$ for each
nonnegative integer n. Notice that x is the only true variable here; n is just a
dummy variable which goes away when you actually expand the sum. An













above one, written in expanded form and 


+ a : 4 + = f^x n . 

n=0 

ill equal to 1. Hopefully you can recognize 
term 1 and ratio x. 

;ion like 

22X 2 + asx s + a^x A H - 


what if $x > 1$ or $x < -1$? The left-hand side makes sense, but the
right-hand side doesn’t since the series diverges for these values of x. (Both
sides are undefined if x is actually equal to 1.)

Something nice happens to the power series

$$a_0 + a_1 x + a_2 x^2 + a_3 x^3 + a_4 x^4 + \cdots$$

if you set x = 0: all the terms vanish except for the $a_0$ at the beginning,
so the series automatically converges (to $a_0$, of course!). This doesn’t tell us
anything about whether the series converges for any other value of x. For

Now 0 is a pretty funky number, admittedly, but it doesn’t need to be more
special than the rest of the real numbers. Let’s transfer this special property
over to some other number a. All we have to do is replace x by (x - a). So
here is the general expression for a power series about x = a:

$$a_0 + a_1(x-a) + a_2(x-a)^2 + a_3(x-a)^3 + a_4(x-a)^4 + \cdots.$$

In sigma notation, this looks like

$$\sum_{n=0}^{\infty} a_n (x-a)^n.$$


This series converges for sure when x = a, since all the terms except $a_0$ vanish.
The number a is called the center of the power series. When would you want
to consider a power series with a center other than 0? One example might be
if you wanted to find a power series which converges to ln(x). This quantity
isn’t defined at x = 0, so it would be silly to try to find a power series about
x = 0 which converges to ln(x). On the other hand, we can find a power series
with center 1 which converges to ln(x), at least for some values of x. Indeed,
at the end of Section 26.2.1 of Chapter 26, we’ll see that the equation

$$\ln(x) = \sum_{n=1}^{\infty} \frac{(-1)^{n-1}}{n}(x-1)^n$$

is valid for $-1 < (x - 1) < 1$, that is, for $0 < x < 2$. (It’s actually even true
for x = 2:

$$\sum_{n=1}^{\infty} \frac{(-1)^{n-1}}{n} = \ln(2).$$

This isn’t so easy to prove, however!)


24.2.2 Taylor series and Maclaurin series 

In the previous section, we saw that a general power series about x = a is
given (using sigma notation and also in expanded form) by

$$\sum_{n=0}^{\infty} a_n (x-a)^n = a_0 + a_1(x-a) + a_2(x-a)^2 + a_3(x-a)^3 + a_4(x-a)^4 + \cdots.$$

This converges for x = a, and might converge for other values of x. In
Section 26.1.2 of Chapter 26, we’ll look at some methods for finding which
values of x make the series converge. We could then plug in all these values of
x one at a time, find what the series converges to in each case, and call that
f(x). So, starting with a power series, we have defined a function.

Suppose that we instead start off with some smooth function $f$. We’re
going to define a special power series about x = a by using all the derivatives
of $f$:

$$\sum_{n=0}^{\infty} \frac{f^{(n)}(a)}{n!}(x-a)^n.$$

So, how do you know if and when a Taylor series actually converges to its
underlying function? Start by writing

$$f(x) = P_N(x) + R_N(x),$$

as we did in Section 24.1.4 above. Remember,

$$P_N(x) = \sum_{n=0}^{N} \frac{f^{(n)}(a)}{n!}(x-a)^n \quad\text{and}\quad R_N(x) = \frac{f^{(N+1)}(c)}{(N+1)!}(x-a)^{N+1}.$$

This expresses $f(x)$ as its approximate value $P_N(x)$ plus the error, or
remainder, $R_N(x)$. Now here’s the clever part: we let N get larger and larger. This
should hopefully make the approximation $P_N(x)$ get closer and closer to the
true value $f(x)$. This is the same thing as saying that hopefully the error
$R_N(x)$ gets smaller and smaller.

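Let’s try to write down some equations to describe all this. Suppose that
for some x, we know that

$$\lim_{N\to\infty} R_N(x) = 0.$$

In the equation $f(x) = P_N(x) + R_N(x)$, take limits as $N \to \infty$:

$$\lim_{N\to\infty} f(x) = \lim_{N\to\infty} P_N(x) + \lim_{N\to\infty} R_N(x) = \lim_{N\to\infty} P_N(x).$$

Since $f(x)$ doesn’t depend on N, the left-hand side is just $f(x)$, so we know
that

$$f(x) = \lim_{N\to\infty} P_N(x) = \lim_{N\to\infty} \sum_{n=0}^{N} \frac{f^{(n)}(a)}{n!}(x-a)^n = \sum_{n=0}^{\infty} \frac{f^{(n)}(a)}{n!}(x-a)^n.$$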


So f(x) equals its Taylor series! In other words, if you want to prove that

for any x. We have to be a little careful about the $e^c$ factor, since c depends
on N. The question is, how big can $e^c$ be? Remember that c is between 0
and x. If x is negative, the biggest $e^c$ could be is if c = 0, which means that
$e^c < 1$. If x is positive, the biggest $e^c$ could be is if c = x, which means
that $e^c < e^x$. In either case, since x is fixed (that is, treated as constant),
we can write that $0 < e^c < C$, where C is another constant. This is true
no matter what N is, even though c is wobbling around all over the place as
N is changing. Anyway, hopefully you believe this, in which case you might
believe that

$$0 \le |R_N(x)| = \frac{e^c\,|x|^{N+1}}{(N+1)!} \le C\,\frac{|x|^{N+1}}{(N+1)!}.$$

Now the left-hand and right-hand sides go to 0 as N tends to $\infty$, so we can
apply the sandwich principle to see that the middle quantity does too. So,
we’ve proved that

$$\lim_{N\to\infty} R_N(x) = 0$$

for any real x. This means that we have finally proved that

$$e^x = 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \frac{x^4}{4!} + \frac{x^5}{5!} + \cdots$$

for all real x.

Let’s try to see everything in one self-contained example by finding the
Maclaurin series of f(x) = cos(x) and showing that it converges to f(x) for
all x. First, we need to differentiate $f$ over and over again, then plug in 0
for each derivative and see what happens. Well, when you differentiate cos(x)
with respect to x, over and over again, you get -sin(x), then -cos(x), then
sin(x), then cos(x), then -sin(x), then -cos(x), and so on. Clearly this cycle
will keep on going. When you plug in x = 0, the sin(x) terms go away, and
the ±cos(x) terms become ±1. So the sequence of numbers $f^{(n)}(0)$ looks like
this:

1, 0, -1, 0, 1, 0, -1, 0, 1, 0, -1, 0, ...

If you plug these numbers into the Maclaurin series formula

$$\sum_{n=0}^{\infty} \frac{f^{(n)}(0)}{n!} x^n,$$

all the odd-degree terms go away and you get

$$1 - \frac{1}{2!}x^2 + \frac{1}{4!}x^4 - \frac{1}{6!}x^6 + \cdots,$$

which you can rewrite more compactly as

$$1 - \frac{x^2}{2!} + \frac{x^4}{4!} - \frac{x^6}{6!} + \cdots.$$

This is the Maclaurin series for cos(x), or if you prefer, the Taylor series for
cos(x) about x = 0. To get the corresponding Taylor polynomials, all you
have to do is chop off the series at the right place. For example,

$$P_4(x) = 1 - \frac{x^2}{2!} + \frac{x^4}{4!}.$$

By the way, the formula for $P_5(x)$ is the same as for $P_4(x)$, since there’s no
fifth-degree term in the above Maclaurin series. This is a good example of
why we need the word “order”: $P_5$ is of order 5, but degree only 4.

Now, all that’s left is to prove that cos(x) actually equals its Maclaurin
series for all real x:

$$\cos(x) = 1 - \frac{x^2}{2!} + \frac{x^4}{4!} - \frac{x^6}{6!} + \cdots.$$

To do this, we need to show that

$$\lim_{N\to\infty} R_N(x) = 0.$$

We know that

$$R_N(x) = \frac{f^{(N+1)}(c)}{(N+1)!} x^{N+1},$$

where c is between 0 and x. Let’s take absolute values:

$$|R_N(x)| = \frac{|f^{(N+1)}(c)|}{(N+1)!}\,|x|^{N+1}.$$

All the derivatives of $f$ are equal to either ±cos(x) or ±sin(x), so the quantity
$|f^{(N+1)}(c)|$ is either |cos(c)| or |sin(c)|. In either case, this quantity is less than
or equal to 1, so we have

$$0 \le |R_N(x)| \le \frac{|x|^{N+1}}{(N+1)!}.$$

Once again, we’ll show in the next section that

$$\lim_{N\to\infty} \frac{|x|^{N+1}}{(N+1)!} = 0.$$

Now you can use the sandwich principle to show that

$$\lim_{N\to\infty} |R_N(x)| = 0,$$

which means that also

$$\lim_{N\to\infty} R_N(x) = 0.$$

We have proved that

$$\cos(x) = 1 - \frac{x^2}{2!} + \frac{x^4}{4!} - \frac{x^6}{6!} + \cdots$$

for all real x. Let’s celebrate by expressing the above series in sigma notation.
(What, isn’t that how you celebrate solving a tough problem?) Anyway, how
do you get only even powers of x? The answer is to use 2n instead of n (see
the end of Section 15.1 in Chapter 15 for a discussion of this sort of thing).
Since the factorial on the bottom matches the degree, we might guess that
the Maclaurin series can be written as

$$\sum_{n=0}^{\infty} \frac{x^{2n}}{(2n)!}.$$

The problem is, this series doesn’t alternate. So insert a factor of $(-1)^n$:

$$\sum_{n=0}^{\infty} \frac{(-1)^n x^{2n}}{(2n)!}.$$

If you expand this, you’ll find that it works. Here’s a summary of what we
found:

$$\cos(x) = \sum_{n=0}^{\infty} \frac{(-1)^n x^{2n}}{(2n)!} = 1 - \frac{x^2}{2!} + \frac{x^4}{4!} - \frac{x^6}{6!} + \cdots$$

for all real x.
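
To watch the convergence happen, here is a small Python sketch (an illustration only; the function name `cos_maclaurin` and the choice x = 2 are made up for this example). It sums the first few terms of the series and compares the true error with the bound $|x|^{N+1}/(N+1)!$ from the proof above, using the fact that the polynomial with k terms is both $P_{2k-2}$ and $P_{2k-1}$.

```python
import math

def cos_maclaurin(x, terms):
    """Sum the first `terms` terms of sum_{n>=0} (-1)^n x^(2n) / (2n)!."""
    return sum((-1)**n * x**(2*n) / math.factorial(2*n) for n in range(terms))

x = 2.0
for terms in (1, 2, 4, 8):
    N = 2 * terms - 1                      # P_{2k-2} and P_{2k-1} coincide
    bound = abs(x)**(N + 1) / math.factorial(N + 1)
    approx = cos_maclaurin(x, terms)
    # columns: number of terms, approximation, actual error, guaranteed bound
    print(terms, approx, abs(math.cos(x) - approx), bound)
```

The actual error should always sit below the bound, and both shrink quickly as more terms are included.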


24.3 A Useful Limit


This section isn’t about power series at all; it just contains a proof of a limit
we needed twice in the previous section:

$$\lim_{N\to\infty} \frac{x^{N+1}}{(N+1)!} = 0$$

for any real number x. By letting n = N + 1 (think of it like a substitution
in an integral), this is exactly the same as showing that

$$\lim_{n\to\infty} \frac{x^n}{n!} = 0$$

for any real number x. There are several ways to prove this last statement,
but here’s a sneaky way. Let me explain the logic I’m going to use, then
actually do it. I’m going to prove that the series

$$\sum_{n=0}^{\infty} \frac{x^n}{n!}$$

converges, regardless of what x is. (Yes, we “know” that it actually converges
to $e^x$, but not until after we finish showing that the above limit is zero after
all!) Anyway, it doesn’t matter what the series converges to; simply knowing
that it converges is enough. Why? Because then the nth term $x^n/n!$ must
go to 0 as n goes to $\infty$, or else the nth term test would fail. That is, if the
terms didn’t go to 0 as n goes to $\infty$, then the series would diverge. So let’s
use the ratio test to show that the series converges for all x. Let’s fix x for
once and for all and, with $a_n = x^n/n!$, simply look at the limiting ratio:

$$L = \lim_{n\to\infty} \left|\frac{a_{n+1}}{a_n}\right| = \lim_{n\to\infty} \left|\frac{x^{n+1}/(n+1)!}{x^n/n!}\right| = \lim_{n\to\infty} |x|\,\frac{n!}{(n+1)!}.$$

Now we know that n!/(n + 1)! boils down to 1/(n + 1), so this last limit is

$$\lim_{n\to\infty} |x|\,\frac{1}{n+1} = 0,$$

since |x| is fixed and 1/(n + 1) goes to 0. The limit L is 0, which is less than
1, so the series converges and we have, as a by-product, shown that the useful
limit is correctly stated above. By the way, the technique of fixing x and then
applying the ratio test to see whether the series converges for that particular
x will be used many times in Section 26.1.2 of Chapter 26.
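
Here is a quick numerical illustration of the useful limit, written as a hedged Python sketch (the function name and the choice x = 10 are just for this example). It builds the terms $x^n/n!$ by repeatedly multiplying by the ratio x/(n + 1), which is exactly the ratio that appears in the ratio test.

```python
def terms_of_xn_over_nfact(x, count):
    """Return [x^0/0!, x^1/1!, ..., x^(count-1)/(count-1)!]."""
    values = []
    term = 1.0
    for n in range(count):
        values.append(term)
        term *= x / (n + 1)   # a_{n+1} / a_n = x / (n + 1)
    return values

values = terms_of_xn_over_nfact(10.0, 60)
print(max(values))   # the terms grow at first...
print(values[-1])    # ...but eventually become tiny
```

Even for a fairly large x, the ratio x/(n + 1) eventually drops below 1 and stays there, which is why the factorial always wins in the end.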











CHAPTER 25

How to Solve Estimation Problems 


In the previous chapter, we showed how Taylor polynomials can be used to
estimate (or approximate, if you prefer) certain quantities. We also saw that
the remainder term could be used to get an idea of how good the
approximation actually is. In this chapter, we’ll develop these techniques and look at a
number of examples. So, here’s the plan for the chapter:

• a review of the most important facts about Taylor polynomials and 
series; 

• how to find Taylor polynomials and series; 

• estimation problems; and 

• a different method for analyzing the error. 

25.1 Summary of Taylor Polynomials and Series

Here are the most important facts about Taylor polynomials and series, all of 
which were developed in the previous chapter: 

1. Of all the polynomials of degree N or less, the one which best
approximates the smooth function $f$ for x near a is called the $N$th-order Taylor
polynomial about x = a, and is given by

$$P_N(x) = f(a) + f'(a)(x-a) + \frac{f''(a)}{2!}(x-a)^2 + \frac{f^{(3)}(a)}{3!}(x-a)^3 + \cdots + \frac{f^{(N)}(a)}{N!}(x-a)^N.$$

Using sigma notation, this can be written as

$$P_N(x) = \sum_{n=0}^{N} \frac{f^{(n)}(a)}{n!}(x-a)^n.$$


2. The polynomial $P_N$ has the same derivatives as $f$ at x = a, up to and
including order N. That is,

$$P_N(a) = f(a),\quad P_N'(a) = f'(a),\quad P_N''(a) = f''(a),\quad P_N^{(3)}(a) = f^{(3)}(a),$$

and so on up to $P_N^{(N)}(a) = f^{(N)}(a)$. The above equations aren’t true in
general if a is replaced by any other number, or for derivatives of order
higher than N. (In fact, the derivatives of $P_N$ of order higher than N
are identically 0, since $P_N$ is a polynomial of degree at most N.)

3. The $N$th-order remainder term $R_N(x)$, otherwise known as the $N$th-order
error term, is simply the difference $f(x) - P_N(x)$. It follows that

$$f(x) = P_N(x) + R_N(x)$$

for any N. The remainder term is given by

$$R_N(x) = \frac{f^{(N+1)}(c)}{(N+1)!}(x-a)^{N+1},$$

where c is some number between x and a which cannot be computed in
general.

4. So, the complete expression for f(x) is given by

$$f(x) = \sum_{n=0}^{N} \frac{f^{(n)}(a)}{n!}(x-a)^n + \frac{f^{(N+1)}(c)}{(N+1)!}(x-a)^{N+1}.$$


5. The infinite series

$$\sum_{n=0}^{\infty} \frac{f^{(n)}(a)}{n!}(x-a)^n$$

is called the Taylor series of f(x) about x = a. For any particular x, this
series may or may not converge. If for any particular x the remainder
term $R_N(x)$ converges to 0 as $N \to \infty$, then we can write

$$f(x) = \sum_{n=0}^{\infty} \frac{f^{(n)}(a)}{n!}(x-a)^n$$

for that x. That is, f(x) is equal to its Taylor series representation
(about x = a) at the point x.

6. In the special case where a = 0, the Taylor series is

$$\sum_{n=0}^{\infty} \frac{f^{(n)}(0)}{n!} x^n.$$

This is called the Maclaurin series of f(x). So, whenever you see the
words “Maclaurin series,” you can mentally replace them by “Taylor
series about x = 0.”
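
The sigma-notation formula in fact 1 translates almost word for word into code. The following Python sketch is only an illustration under stated assumptions: the helper name `taylor_polynomial` is invented here, and you have to supply the derivative values $f^{(n)}(a)$ yourself (the code does no differentiation).

```python
import math

def taylor_polynomial(derivs_at_a, a, x):
    """Evaluate P_N(x) = sum_{n=0}^{N} f^(n)(a)/n! * (x - a)^n.

    derivs_at_a[n] must hold f^(n)(a); N is len(derivs_at_a) - 1.
    """
    return sum(d / math.factorial(n) * (x - a)**n
               for n, d in enumerate(derivs_at_a))

# f(x) = e^x about a = 0: every derivative at 0 equals 1.
for N in (1, 2, 4, 8):
    print(N, taylor_polynomial([1.0] * (N + 1), 0.0, 0.5), math.exp(0.5))
```

As N grows, the printed values of $P_N(0.5)$ should creep toward the value of $e^{0.5}$ reported by the math library.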
25.2 Finding Taylor Polynomials and Series

Suppose you want to find a certain Taylor polynomial or series. If you’re lucky,
you can take a Taylor polynomial or series you already know, manipulate it,
and get the polynomial or series you want. We’ll see some techniques of how
to do this in Section 26.2 of the next chapter. Unfortunately, this doesn’t
always work: sometimes, you need to bust out the formula for the Taylor
series of $f$ about x = a from the above summary:

$$\sum_{n=0}^{\infty} \frac{f^{(n)}(a)}{n!}(x-a)^n.$$

Knowing the number a and the function $f$, you have to find the values of all
the derivatives of $f$, evaluated at x = a, and then plug them into the above
formula. This can be a real pain in the butt, however! Differentiating once or
twice is bad enough, but differentiating hundreds and thousands of times is
ridiculous. Things aren’t so bad if you only want to find a Taylor polynomial
of low degree, since then you only have to calculate a few derivatives. We’ll
also see some nice tricks in Section 26.2 that can help you avoid the above
formula altogether, if you’re lucky.

On the other hand, some functions are really easy to differentiate. One
such example is the function $f$ defined by $f(x) = e^x$; we looked at the
Maclaurin series of this function in the previous chapter. What if you don’t want the
Maclaurin series of $f$, but instead you want the Taylor series about x = -2?
Well, put a = -2 instead of 0 in the above formula to see that we are looking
for

$$\sum_{n=0}^{\infty} \frac{f^{(n)}(-2)}{n!}(x+2)^n.$$

We need the values of $f^{(n)}(-2)$ for many values of n, so it’s really helpful to
set up a table of derivatives. In general, the template should look like this:

The middle column of this table should be filled in first. Start off with the
function itself in the top row, then just keep differentiating. Each time you
differentiate, put the result in the next row of the table (still in the middle
column). When the middle column is all filled in, substitute x = a into each
entry in the middle column and enter the value in the same row in the third
column. Note that you may have to use more rows; it depends how big n is,
or how soon you can work out the pattern. In our example, a = -2 and all
of the derivatives of f(x) are $e^x$, so the filled-in table looks like this:
Of course, to find the fourth-order Taylor polynomial $P_4(x)$ (still about the
center $x = \pi/6$), just drop the “$+ \cdots$” at the end. If you only want $P_3(x)$,
then also drop the last term, so that the final power is 3:


Ps(x)= 





(Now we replaced 2! by 2 and 3! by 6.) On the other hand, if you actually
wanted $P_5(x)$, you’d have to add another row at the end of the above table
corresponding to n = 5, so that you get the extra term in $(x - \pi/6)^5$ that you
need.

One more example: what is the Maclaurin series of $(1 + x)^{1/2}$? Since we
want a Maclaurin series, we need to set a = 0. Let’s draw up a table of
derivatives up to fourth order:

  n    $f^{(n)}(x)$                      $f^{(n)}(0)$
  0    $(1+x)^{1/2}$                     1
  1    $\frac{1}{2}(1+x)^{-1/2}$         1/2
  2    $-\frac{1}{4}(1+x)^{-3/2}$        -1/4
  3    $\frac{3}{8}(1+x)^{-5/2}$         3/8
  4    $-\frac{15}{16}(1+x)^{-7/2}$      -15/16

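Now, let’s write down the general formula for the Maclaurin series,

$$f(0) + f'(0)x + \frac{f''(0)}{2!}x^2 + \frac{f^{(3)}(0)}{3!}x^3 + \frac{f^{(4)}(0)}{4!}x^4 + \cdots,$$

then plug in the numbers for the derivatives from the above table to get

$$1 + \frac{1}{2}x + \frac{-1/4}{2!}x^2 + \frac{3/8}{3!}x^3 + \frac{-15/16}{4!}x^4 + \cdots.$$

Let’s simplify this as

$$1 + \frac{x}{2} - \frac{x^2}{8} + \frac{x^3}{16} - \frac{5x^4}{128} + \cdots.$$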

In fact, it turns out that the remainder term goes to 0 when x is between -1
and 1 (this is tricky to prove!), so we actually have

$$(1 + x)^{1/2} = 1 + \frac{x}{2} - \frac{x^2}{8} + \frac{x^3}{16} - \frac{5x^4}{128} + \cdots$$

when $-1 < x < 1$. This is a special case of the binomial theorem, which says
that

$$(1 + x)^a = 1 + ax + \frac{a(a-1)}{2!}x^2 + \frac{a(a-1)(a-2)}{3!}x^3 + \frac{a(a-1)(a-2)(a-3)}{4!}x^4 + \cdots$$

for $-1 < x < 1$. The series on the right-hand side diverges when $x > 1$ or
$x < -1$ unless a happens to be a nonnegative integer. (In that case, the
right-hand side is actually a polynomial. Can you see why?)
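
If you’d like to check the coefficients without differentiating by hand, here is a short Python sketch (illustrative only; the function name is invented for this example). It uses the pattern visible in the binomial formula above: each coefficient is the previous one multiplied by (a - n)/(n + 1).

```python
from fractions import Fraction

def binomial_series_coefficients(a, count):
    """Coefficients of 1, x, x^2, ... in the Maclaurin series of (1 + x)^a."""
    coeffs = []
    c = Fraction(1)
    for n in range(count):
        coeffs.append(c)
        c = c * (a - n) / (n + 1)   # next coefficient: multiply by (a - n)/(n + 1)
    return coeffs

print(binomial_series_coefficients(Fraction(1, 2), 5))
print(binomial_series_coefficients(3, 6))
```

Trying a nonnegative integer such as a = 3 also hints at the answer to the question above: once n reaches a, the factor (a - n) is zero, so every later coefficient vanishes.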
25.3 Estimation Problems Using the Error Term


In Section 24.1.4 of the previous chapter, we used a third-order Taylor
polynomial $P_3$ to estimate $e^{-1/10}$; then we used the remainder term $R_3$ to get
an idea of how good our approximation was. Let’s revisit these methods and
generalize them.

To set the scene, consider the following two similar examples:

1. Estimate $e^{1/3}$ using a Taylor polynomial of order 2, and also estimate
the error.

2. Estimate $e^{1/3}$ with an error no more than 1/10000.



The second problem is more difficult than the first one. You see, in the first
problem, we know that we’re dealing with a Taylor polynomial of order 2, so
we can set N = 2 in our formulas. In the second problem, we actually have
to find N, which is one more thing to worry about.

With these two types of problems in mind, check out the general method 
for solving estimation (or approximation) problems: 


1. Look at what you want to estimate, and pick a relevant function $f$. In
our examples above, we want to estimate $e^{\text{something}}$, so set $f(x) = e^x$.
Later on, we will set x = 1/3, since $f(1/3) = e^{1/3}$, the quantity we want
to estimate.

2. Pick a number a which is pretty close to this value of x, and so that f(a)
is really nice. This means that you should be able to write down f(a)
exactly, as well as $f'(a)$, $f''(a)$, and so on. In our example, we’ll put
a = 0, since that’s pretty close to 1/3 and also $e^0$ is easy to compute.

3. Make a table of derivatives of $f$, just like we did in the previous section.
It should have three columns which show the values of n, $f^{(n)}(x)$, and
$f^{(n)}(a)$. If you know the order of the Taylor polynomial to use, that’s
the value of N you’ll need; make sure to go up to the (N+1)th derivative
in the table. Otherwise, just write down as many rows as you can be
bothered to; you can always fill in more later if you need to.

4. If you don’t care about the error in your estimate, skip to step 8.
Otherwise, write down the formula for $R_N(x)$:

$$R_N(x) = \frac{f^{(N+1)}(c)}{(N+1)!}(x-a)^{N+1},$$

making sure to write “c is between a and x.” As you’re writing, replace
a by its true value on the fly, including in your comment about c.

5. If you know the order of the Taylor polynomial to use, replace N by this
number in the above formula. If not, make an educated guess based on
how small you need the error to be. The smaller, the higher N should
be. For many problems, N = 2 or 3 will do nicely. If you’re wrong,
you’ll know soon enough; you’ll just have to repeat this step and the
next two steps with a higher value of N.

6. Now, replace x by the value you want in the formula for $R_N(x)$. No
unknown variables should be left except for c, and you should write
down the possible range of c as an inequality. In our case, with a = 0
and x = 1/3, we know that c lies in between, so we’d write $0 < c < 1/3$.

7. Find the maximum value of $|R_N(x)|$, where c lies in the appropriate
interval. This is how big the error can possibly be. If you know the
value of N, you’re all done with the error estimate. If not, compare the
actual error with the one you want. If your actual error is smaller, that’s
great: you have found a good value of N. Otherwise, you’re a little bit
screwed; you have to go back to step 5 and try again. (We’ll look at
some techniques for maximizing $|R_N(x)|$ in Section 25.3.6 below.)

8. Finally, it’s time to find the actual estimate! Write down the formula
for $P_N(x)$:

$$P_N(x) = f(a) + f'(a)(x-a) + \frac{f''(a)}{2!}(x-a)^2 + \frac{f^{(3)}(a)}{3!}(x-a)^3 + \cdots + \frac{f^{(N)}(a)}{N!}(x-a)^N.$$

Now replace a and N by the values from above to get a formula in terms
of x alone. Finally, write down the approximation

$$f(x) \approx P_N(x)$$

and plug in the actual value of x that you need. The left-hand side will
be the quantity you want, and the right-hand side will be the
approximation.

9. One other piece of information is available if you want it: if $R_N(x)$ is
positive, your estimate is an underestimate; if $R_N(x)$ is negative, the
estimate is an overestimate. These facts follow from the equation

$$f(x) = P_N(x) + R_N(x).$$

Now, let’s look at five examples of these types of problems. 


25.3.1 First example



We’d better start with the two questions from the previous section. In the
first problem, we want to estimate $e^{1/3}$ by using a second-order Taylor
polynomial. This is actually quite similar to our example involving $e^{-1/10}$ from
Section 24.1.4 of the previous chapter. Anyway, let’s follow the above method.
We start by picking $f$; since we’re exponentiating, let’s set $f(x) = e^x$ and note
that our quantity $e^{1/3}$ is just $f(1/3)$. Eventually, we’ll put x = 1/3, but not
yet. We also need to pick a close to 1/3 so that $e^a$ is nice; as I mentioned, 0
is a natural choice.

Now, it’s time for a table of derivatives:

  n    $f^{(n)}(x)$    $f^{(n)}(0)$
  0    $e^x$           1
  1    $e^x$           1
  2    $e^x$           1
  3    $e^x$           1
I went up to 3, since that’s one more than 2, and we need a second-order
Taylor polynomial (that is, N = 2). OK, so moving on, the error term is

$$R_N(x) = \frac{f^{(N+1)}(c)}{(N+1)!}x^{N+1},$$

where c is between 0 and x. Notice that I replaced a by 0 in the standard
formula for $R_N(x)$. Now, we know that N = 2, so we actually need

$$R_2(x) = \frac{f^{(3)}(c)}{3!}x^3 = \frac{e^c}{6}x^3.$$

I read $f^{(3)}(c) = e^c$ from the last row of the middle column in the table above,
replacing x by c. Now, let’s replace x by 1/3 to see that

$$R_2(1/3) = \frac{e^c}{6}(1/3)^3 = \frac{e^c}{162};$$


here c is between 0 and x = 1/3, so $0 < c < 1/3$. Let’s take absolute values:

$$|R_2(1/3)| = \frac{e^c}{162},$$

since $e^c$ must be positive. Next, we need to maximize $|R_2(1/3)|$. Since $e^c$ is
increasing in c, the largest value occurs when c = 1/3. This shows that

$$|R_2(1/3)| = \frac{e^c}{162} < \frac{e^{1/3}}{162}.$$

We seem to have a problem, since we don’t know what $e^{1/3}$ is. That’s actually
the whole point of the question in the first place! Never mind, let’s make a
gross overestimate for $e^{1/3}$. You see, $e < 8$, so $e^{1/3} < 8^{1/3}$ which is just 2.
Why did I choose 8? Because I can take the cube root of it without thinking
too much! Anyway, using the inequality $e^{1/3} < 2$, the above inequality for
$|R_2(1/3)|$ becomes

$$|R_2(1/3)| < \frac{e^{1/3}}{162} < \frac{2}{162} = \frac{1}{81}.$$


So the error is no more than 1/81. We still need to find the estimate. Let’s
write down the formula for $P_2(x)$, using the fact that a = 0:

$$P_2(x) = f(0) + f'(0)x + \frac{f''(0)}{2!}x^2.$$

From the above table, we can replace all of $f(0)$, $f'(0)$, and $f''(0)$ by 1:

$$P_2(x) = 1 + x + \frac{1}{2}x^2.$$

Finally, put x = 1/3 to get

$$P_2(1/3) = 1 + \frac{1}{3} + \frac{1}{2}\left(\frac{1}{3}\right)^2 = \frac{25}{18}.$$
Since $f(x) \approx P_2(x)$, we have

$$f(1/3) \approx P_2(1/3).$$

Using the fact that $f(x) = e^x$, we see that

$$e^{1/3} = f(1/3) \approx P_2(1/3) = \frac{25}{18}.$$

We have already shown that $|R_2(1/3)| < 1/81$, so our estimate is accurate
to at least 1/81. In fact, since $R_2(1/3)$ is positive, our estimate 25/18 is an
underestimate to the true value $e^{1/3}$.
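
As a sanity check on this first example, here is a minimal Python sketch (an illustration, not part of the method; the variable names are invented). It computes $P_2(1/3)$ and the bound 2/162 exactly, then compares them with the computer's floating-point value of $e^{1/3}$, which is of course itself an approximation.

```python
from fractions import Fraction
import math

x = Fraction(1, 3)
p2 = 1 + x + x**2 / 2            # P_2(1/3)
bound = Fraction(2, 6) * x**3    # |R_2(1/3)| = (e^c / 6)(1/3)^3 < 2/162

true_error = math.exp(1 / 3) - float(p2)
print(p2, bound, true_error)
```

The error printed at the end should be positive (so 25/18 really is an underestimate) and comfortably below 1/81.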


25.3.2 Second example 

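Now we’ll do the second example from Section 25.3 above: estimate $e^{1/3}$
with an error less than 1/10000. Just as in the previous example, we’ll set
$f(x) = e^x$, a = 0, and eventually we’ll put x = 1/3. Once again, we have

$$R_N(x) = \frac{f^{(N+1)}(c)}{(N+1)!}x^{N+1} = \frac{e^c}{(N+1)!}x^{N+1},$$

where c is between 0 and x. We already know from the previous example that
N = 2 won’t work, since we got a maximum error of 1/81 and we need the
error to be less than 1/10000. So, let’s see if N = 3 will work. The error term
is

$$R_3(x) = \frac{f^{(4)}(c)}{4!}x^4 = \frac{e^c}{24}x^4,$$

where c is between 0 and x. Put x = 1/3 to get

$$R_3(1/3) = \frac{e^c}{24}\left(\frac{1}{3}\right)^4 = \frac{e^c}{24 \times 81},$$

where $0 < c < 1/3$. Again, we can use the fact from the previous section that
$e^c < 2$ if c is between 0 and 1/3:

$$|R_3(1/3)| = \frac{e^c}{24 \times 81} < \frac{2}{24 \times 81} = \frac{1}{972}.$$

This is not less than 1/10000, so N = 3 is not big enough. Let’s try N = 4.
Repeating the above steps, we have

$$R_4(x) = \frac{f^{(5)}(c)}{5!}x^5 = \frac{e^c}{120}x^5,$$

where c is between 0 and x. With x = 1/3, we see that

$$R_4(1/3) = \frac{e^c}{120}\left(\frac{1}{3}\right)^5 = \frac{e^c}{120 \times 243}.$$

Again c is between 0 and 1/3, and again $e^c < 2$ there, so

$$|R_4(1/3)| = \frac{e^c}{120 \times 243} < \frac{2}{120 \times 243} = \frac{1}{14580}.$$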













(If you think you need a calculator to work out the last fraction, think again:
you can reduce 2/120 down to 1/60, then work out 6 × 243, multiply it by
10, and stick it in the denominator.) In any case, we know that $|R_4(1/3)|$ is
plenty less than 1/10000, so we’re golden: we can take N = 4. So what is the
estimate? We need to find $P_4(1/3)$. In general, when a = 0, the fourth-order
Taylor polynomial $P_4$ is given by

$$P_4(x) = 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \frac{x^4}{4!},$$

so

$$P_4\left(\frac{1}{3}\right) = 1 + \frac{1}{3} + \frac{(1/3)^2}{2} + \frac{(1/3)^3}{6} + \frac{(1/3)^4}{24} = 1 + \frac{1}{3} + \frac{1}{18} + \frac{1}{162} + \frac{1}{1944} = \frac{2713}{1944}.$$

That is,

$$e^{1/3} = f(1/3) \approx P_4(1/3) = \frac{2713}{1944}.$$

So, we can replace our estimate 25/18 from the previous example by a much
better estimate, namely 2713/1944. This new estimate is guaranteed to be
within 1/10000 of the true value $e^{1/3}$. As a test, I did use a calculator to
see that 2713/1944 is 1.39558 to five decimal places, whereas $e^{1/3}$ is 1.39561
to five decimal places. These quantities are therefore at most 0.00004 apart,
which is well within the allowed tolerance of 1/10000 = 0.0001.
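
The trial-and-error search for N in this example can be mimicked with a short loop. The Python sketch below is only an illustration under stated assumptions: the function name `smallest_order` is invented, and the bound $e^c < 2$ is passed in by hand, exactly as we argued above, rather than being discovered by the code.

```python
from fractions import Fraction
import math

def smallest_order(x, tolerance, exp_bound):
    """Smallest N with exp_bound * x^(N+1)/(N+1)! < tolerance (for x > 0)."""
    N = 0
    while exp_bound * x**(N + 1) / math.factorial(N + 1) >= tolerance:
        N += 1
    return N

x = Fraction(1, 3)
N = smallest_order(x, Fraction(1, 10000), 2)   # uses e^c < 2 for 0 < c < 1/3
estimate = sum(x**n / math.factorial(n) for n in range(N + 1))
print(N, estimate)
```

Because everything is done with exact fractions, the printed estimate can be compared directly with the fraction found by hand.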


25.3.3 Third example 



Here’s a question: estimate $\sqrt{27}$ with an error of no more than 1/250.
According to the above method, we have to select an appropriate function $f$ and
values of a and x. A good choice of the function would be given by $f(x) = \sqrt{x}$,
or if you prefer, $f(x) = x^{1/2}$. Then we want to estimate $f(27) = \sqrt{27}$, so
eventually we’ll set x = 27. Now we need a number close to 27 that we can easily
take the square root of. It seems as if 25 is pretty good, so let’s take a = 25.
That takes care of the first two steps. Moving on to step 3, let’s draw up a table
of derivatives:

  n    $f^{(n)}(x)$                 $f^{(n)}(25)$
  0    $x^{1/2}$                    5
  1    $\frac{1}{2}x^{-1/2}$        1/10
  2    $-\frac{1}{4}x^{-3/2}$       -1/500
  3    $\frac{3}{8}x^{-5/2}$        $\frac{3}{8} \times \frac{1}{5^5}$

Remember, to fill in this table, we put the entry $x^{1/2}$ in the top row of the
middle column, and then differentiated a few times, putting the results in each
successive row in the middle column. Finally, the entries in the right-hand
column come from substituting the value a = 25. The difficulty is that we
don’t know how much of this table we need. Perhaps we’ll even need more
rows.

So let’s look at the error term, which is given by

$$R_N(x) = \frac{f^{(N+1)}(c)}{(N+1)!}(x-25)^{N+1}.$$
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25.3.4 Fourth example 



To see just what we’re up against here, let’s suppose that we change the
previous question slightly. Instead of $\sqrt{27}$, let’s say we want to estimate
$\sqrt{23}$ within a tolerance of 1/250. This can’t be too different from the
previous example, right? Well, not quite. Let’s see what happens. We’re still
going to use the Taylor series for $f(x) = x^{1/2}$ with $a = 25$, but now we have
to put $x = 23$ instead of $x = 27$. Let’s see what happens with the remainder
term $R_1$ which worked so well in the previous example:

$$|R_1(23)| = \left|\frac{f''(c)}{2!}(23 - 25)^2\right| = \left|-\frac{1}{4}c^{-3/2} \times \frac{1}{2} \times (-2)^2\right|.$$

This is exactly what the error term was in the previous example! There is an
important difference: now $c$ is between 23 and 25. So, how big can $|c^{-3/2}|$
be? Well, again this quantity is decreasing in $c$, so it’s biggest when $c$ is as
small as possible, namely when $c = 23$. This leads to the following estimate:

$$|R_1(23)| = \frac{c^{-3/2}}{2} < \frac{23^{-3/2}}{2}.$$

Unfortunately, $23^{-3/2}$ isn’t as easy to compute as $25^{-3/2}$. The one thing we
can be sure of is that this isn’t good enough. You see, $\frac{1}{2} \cdot 25^{-3/2} = 1/250$, but
$\frac{1}{2} \cdot 23^{-3/2}$ is bigger than this, so it’s too big. So $N = 1$ isn’t going to fly; we
have to try $N = 2$.

OK, so taking $N = 2$ and using the table of derivatives from the previous
example (Section 25.3.3), we have

$$|R_2(23)| = \left|\frac{f^{(3)}(c)}{3!}(23 - 25)^3\right| = \left|\frac{3}{8}c^{-5/2} \times \frac{1}{6} \times (-2)^3\right|,$$

where $23 < c < 25$. Once again, $c^{-5/2}$ is biggest when $c = 23$, so we have

$$|R_2(23)| = \frac{c^{-5/2}}{2} < \frac{23^{-5/2}}{2}.$$

Is this good enough? Not having a calculator available, we have to come up
with some way of estimating $23^{-5/2}$. Man, how are we going to do that?
The best way I can think of is to come up with a number that is less than
23 that we can easily raise to the power $-5/2$. That would be 16. Now
$16^{-5/2} = 1/4^5 = 1/1024$, so

$$|R_2(23)| < \frac{23^{-5/2}}{2} < \frac{16^{-5/2}}{2} = \frac{1}{2048}.$$

This is certainly smaller than 1/250, so taking $N = 2$ works and we can use
$P_2(23)$. Now

$$P_2(x) = f(25) + f'(25)(x - 25) + \frac{f''(25)}{2!}(x - 25)^2 = 5 + \frac{1}{10}(x - 25) - \frac{1}{500 \times 2}(x - 25)^2$$

(using the table once more), so replacing $x$ by 23, we have

$$P_2(23) = 5 - \frac{2}{10} - \frac{4}{1000} = \frac{1199}{250}.$$

So our estimate for $\sqrt{23}$ is 1199/250. Now, my calculator says that this last
fraction is equal to 4.796 exactly, whereas it says that $\sqrt{23}$ is about 4.79583.
These two numbers are indeed within 1/250 of each other.
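
Here is a short Python sketch of the same calculation, in case you want to check the fractions. It simply hard-codes the three derivative values from the table of derivatives; the names `derivs` and `p2` are illustrative only, and the bound is the one found above using 16 in place of 23.

```python
from fractions import Fraction
from math import sqrt

a = 25
# f(25), f'(25), f''(25) for f(x) = x^(1/2), taken from the table of derivatives
derivs = [Fraction(5), Fraction(1, 10), Fraction(-1, 500)]

def p2(x):
    """Second-order Taylor polynomial of sqrt(x) about a = 25."""
    h = Fraction(x) - a
    return derivs[0] + derivs[1] * h + derivs[2] / 2 * h**2

estimate = p2(23)
error_bound = Fraction(1, 2) * Fraction(1, 1024)   # 16^(-5/2) / 2
actual_error = abs(sqrt(23) - float(estimate))

print(estimate, float(estimate), error_bound, actual_error)
```

The actual error turns out to be well under the bound of 1/2048, which in turn is well under the tolerance of 1/250.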


25.3.5 Fifth example 


Let’s look at one more example: estimate $\cos(\pi/3 - 0.01)$ using a third-order
Taylor series, and determine how good the estimate is. Well, we need to
choose a function; the obvious one is given by $f(x) = \cos(x)$, so we’ll need to
put $x = \pi/3 - 0.01$ in the end. What’s a number close to this value of $x$ that
we can easily take the cosine of? It seems like $a = \pi/3$ is a natural candidate.
So we set up a table as follows:

  n     $f^{(n)}(x)$     $f^{(n)}(\pi/3)$
  0     $\cos(x)$        $1/2$
  1     $-\sin(x)$       $-\sqrt{3}/2$
  2     $-\cos(x)$       $-1/2$
  3     $\sin(x)$        $\sqrt{3}/2$
  4     $\cos(x)$        not needed

The error term $R_3(x)$ is given by

$$R_3(x) = \frac{f^{(4)}(c)}{4!}\left(x - \frac{\pi}{3}\right)^4,$$

where $c$ is between $x$ and $\pi/3$. Notice that we need $f^{(4)}(c)$, not $f^{(4)}(\pi/3)$; that
explains the use of “not needed” in the above table. Now, when $x = \pi/3 - 0.01$,
we have

$$R_3\left(\frac{\pi}{3} - 0.01\right) = \frac{\cos(c)}{4!}(-0.01)^4 = \frac{\cos(c)}{24 \times 10^8}.$$

(Here we have used $(-0.01)^4 = (0.01)^4 = (10^{-2})^4 = 10^{-8}$.) Now we just need
to estimate the absolute value of this error term; since $|\cos(c)| \le 1$, we see
that

$$\left|R_3\left(\frac{\pi}{3} - 0.01\right)\right| \le \frac{1}{24 \times 10^8} = \frac{1}{2400000000}.$$

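Great: we know that using $P_3(\pi/3 - 0.01)$ to estimate $\cos(\pi/3 - 0.01)$ will be
accurate to within the tiny number 1/2400000000. So what is $P_3(\pi/3 - 0.01)$?
In general,

$$P_3(x) = f\left(\frac{\pi}{3}\right) + f'\left(\frac{\pi}{3}\right)\left(x - \frac{\pi}{3}\right) + \frac{f''(\pi/3)}{2!}\left(x - \frac{\pi}{3}\right)^2 + \frac{f^{(3)}(\pi/3)}{3!}\left(x - \frac{\pi}{3}\right)^3.$$

Using the above table of derivatives, this becomes

$$P_3(x) = \frac{1}{2} - \frac{\sqrt{3}}{2}\left(x - \frac{\pi}{3}\right) - \frac{1}{2} \times \frac{1}{2}\left(x - \frac{\pi}{3}\right)^2 + \frac{1}{6} \times \frac{\sqrt{3}}{2}\left(x - \frac{\pi}{3}\right)^3.$$

Put $x = \pi/3 - 0.01$ and simplify; the result is

$$P_3\left(\frac{\pi}{3} - 0.01\right) = \frac{1}{2} - \frac{\sqrt{3}}{2}(-0.01) - \frac{1}{4}(-0.01)^2 + \frac{\sqrt{3}}{12}(-0.01)^3 = \frac{1}{2} + \frac{\sqrt{3}}{200} - \frac{1}{40000} - \frac{\sqrt{3}}{12000000}.$$

This might seem like a nasty expression, but it’s really not too bad. The
only tricky quantity is $\sqrt{3}$, but that’s pretty easy to estimate by itself. At
least there are no trig functions to deal with. Anyway, since $f(\pi/3 - 0.01)$ is
approximately equal to $P_3(\pi/3 - 0.01)$, we have

$$\cos\left(\frac{\pi}{3} - 0.01\right) = f\left(\frac{\pi}{3} - 0.01\right) \approx \frac{1}{2} + \frac{\sqrt{3}}{200} - \frac{1}{40000} - \frac{\sqrt{3}}{12000000},$$

accurate to within 1/2400000000.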
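
As a sanity check (not a replacement for the error bound), here is a small Python sketch comparing the third-order estimate with the built-in cosine. It uses ordinary floating-point arithmetic, so the comparison itself is only accurate to roughly sixteen significant digits, which is plenty here.

```python
from math import cos, pi, sqrt

h = -0.01   # this is x - pi/3 when x = pi/3 - 0.01

# P_3(pi/3 + h), built from the derivative values 1/2, -sqrt(3)/2, -1/2, sqrt(3)/2
estimate = 0.5 - (sqrt(3) / 2) * h - 0.25 * h**2 + (sqrt(3) / 12) * h**3

error_bound = 1 / 2_400_000_000          # from |cos(c)| <= 1
actual_error = abs(cos(pi / 3 - 0.01) - estimate)

print(estimate, actual_error, error_bound)
```

Since $\cos(c)$ is close to 1/2 for $c$ near $\pi/3$, you should expect the actual error to be about half of the bound.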

25.3.6 General techniques for estimating the error term 

In all the above examples, we had to estimate the quantity $|f^{(N+1)}(c)|$ for $c$
in some given range. Here are some general tips for doing this:

1. Regardless of the value of $c$, you can always use the standard inequalities
$|\sin(c)| \le 1$ and $|\cos(c)| \le 1$.

2. If the function $f^{(N+1)}$ is increasing, then its value is biggest at the
right-hand endpoint. In the first two examples above, we needed to find the
largest value of $e^c$, where $0 < c < 1/3$. Since $e^c$ is increasing in $c$,
we can say that $e^c < e^{1/3}$. On the other hand, in the example from
Section 24.1.4 of the previous chapter, we also needed to maximize $e^c$,
but this time $-1/10 < c < 0$. Again, since $e^c$ is increasing in $c$, this
maximum value is just $e^0 = 1$. That is, $e^c < e^0 = 1$.

3. If the function $f^{(N+1)}$ is decreasing, then the greatest value of $f^{(N+1)}(c)$
occurs at the left-hand endpoint of the interval. For example, if you know
that $c$ is between 1 and 5, then the greatest value of $1/(3 + c)^4$ occurs at
the left-hand endpoint of the interval $[1, 5]$, since $1/(3 + c)^4$ is decreasing
in $c$. So the above expression is biggest when $c = 1$, and its value then
is $1/4^4 = 1/256$.

4. In general, you might have to find the critical points of the function
$f^{(N+1)}$ in order to maximize it. (See Section 11.1.1 of Chapter 11 for a
reminder on how to do this.)
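
If you have a computer handy, a crude numerical check can confirm a bound you found by hand. The following Python sketch just samples the function at evenly spaced points; the helper `max_abs_on_interval` is hypothetical, and sampling is only a heuristic (it can miss a narrow spike between sample points), so it doesn’t replace the reasoning in the tips above.

```python
def max_abs_on_interval(g, left, right, samples=10_001):
    """Largest sampled value of |g(c)| for c in [left, right] (a heuristic, not a proof)."""
    step = (right - left) / (samples - 1)
    return max(abs(g(left + i * step)) for i in range(samples))

# Tip 3: 1/(3 + c)^4 is decreasing on [1, 5], so the maximum should be at c = 1.
bound = max_abs_on_interval(lambda c: 1 / (3 + c) ** 4, 1, 5)
print(bound, 1 / 256)
```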

25.4 Another Technique for Estimating the Error

Cast your mind back to the alternating series test (see Section 22.5.4 in
Chapter 22). This test says that if a series is alternating, and has terms whose
absolute values are decreasing to 0, then the series converges. The reason this
is true is that the partial sums form a sort of yoyo about the actual limit: one
is bigger, the next one is smaller, the next is bigger, and so on. Each time, the
partial sums do get closer to the actual limit, so the yoyo is losing steam. The
idea is that at each point in the series, adding the next term overshoots the
actual value, so the entire error is less than the next term in absolute value.

Let’s see what this looks like in symbols. Suppose you start off with some
function $f$, and find its Taylor series about $x = a$. If you also happen to know
that the series converges to $f(x)$ for some particular value of $x$ (as it often
does for the sorts of functions we look at), then you can write

$$f(x) = \sum_{n=0}^{\infty} \frac{f^{(n)}(a)}{n!}(x - a)^n.$$

For the particular value of $x$ that you’re interested in, if the above series is
alternating with terms whose absolute values decrease to 0, then the error is
less than the next term. That is,

$$|R_N(x)| \le \left|\frac{f^{(N+1)}(a)}{(N+1)!}(x - a)^{N+1}\right|.$$

There’s no nasty c to worry about, which is more than enough reason to 
use this nice fact. Remember, it only works if the series satisfies the three 
conditions for the alternating series test! 

Here’s an example of where this method really shines. Suppose we’d like
to use a Maclaurin series to find an estimate for the definite integral

$$\int_0^1 \frac{1 - \cos(t)}{t^2}\,dt$$

with an error no greater than 1/3000. By the way, this looks like an improper
integral, with problem spot at $t = 0$, but actually $t = 0$ isn’t a problem spot
at all. You see, by l’Hôpital’s Rule,

$$\lim_{t \to 0} \frac{1 - \cos(t)}{t^2} = \lim_{t \to 0} \frac{\sin(t)}{2t} = \frac{1}{2}.$$

That is, the integrand doesn’t blow up at $t = 0$ after all, so the integral isn’t
improper. Anyway, that’s just an observation. Now we have to solve the
problem.

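The first useful idea is that we can form a function that looks something
like the integral by setting

$$f(x) = \int_0^x \frac{1 - \cos(t)}{t^2}\,dt.$$

The integral we want to estimate is then $f(1)$. We need to find the Maclaurin
series for $f$. To do this, replace $\cos(t)$ by its Maclaurin series, which we found
in Section 24.2.3 of the previous chapter. That is,

$$f(x) = \int_0^x \frac{1 - \left(1 - \dfrac{t^2}{2!} + \dfrac{t^4}{4!} - \dfrac{t^6}{6!} + \dfrac{t^8}{8!} - \cdots\right)}{t^2}\,dt.$$

If you simplify things a little, you should be able to reduce this to

$$f(x) = \int_0^x \left(\frac{1}{2!} - \frac{t^2}{4!} + \frac{t^4}{6!} - \frac{t^6}{8!} + \cdots\right)dt.$$

Now do the integration and evaluate at the endpoints:

$$f(x) = \left(\frac{t}{2!} - \frac{t^3}{3 \times 4!} + \frac{t^5}{5 \times 6!} - \frac{t^7}{7 \times 8!} + \cdots\right)\Bigg|_0^x = \frac{x}{2!} - \frac{x^3}{3 \times 4!} + \frac{x^5}{5 \times 6!} - \frac{x^7}{7 \times 8!} + \cdots.$$

By the way, it’s a good exercise to try writing this series in sigma notation.
Anyway, we can now put in $x = 1$ to see that

$$f(1) = \int_0^1 \frac{1 - \cos(t)}{t^2}\,dt = \frac{1}{2!} - \frac{1}{3 \times 4!} + \frac{1}{5 \times 6!} - \frac{1}{7 \times 8!} + \cdots.$$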

Truth be told, I pulled a couple of fast ones on you here. First, I replaced 
cos(t) by its Maclaurin series. Well, that’s ok ― we’ve seen in Section 24.2.3 
of the previous chapter that we can do this for all t. Second, I integrated an 
infinite series term by term and claimed that the new series converges to / 
for all x. We’ll see in Section 26.2.3 of the next chapter that this sort of thing 
is valid (although we won’t prove it). Anyway, the above equation is correct; 
we now have an exact expression for our integral in terms of an infinite series. 

Now the only question is, how many terms do we have to take to get an
approximation that is within 1/3000 of the true value? Well, notice that the
series is alternating and that its terms are decreasing and go to 0. So we can
use the idea that the absolute value of the next term is bigger than the error.
For example, if you approximate the integral by the first term 1/2!, the error
is no bigger than $1/(3 \times 4!)$, which equals 1/72. That is much too big. How
about if you approximate the integral using the first two terms? That is, what
if you use

$$\int_0^1 \frac{1 - \cos(t)}{t^2}\,dt \approx \frac{1}{2!} - \frac{1}{3 \times 4!} = \frac{35}{72}?$$

Then the error is less than the absolute value of the next term:

$$|\text{error}| < \frac{1}{5 \times 6!} = \frac{1}{5 \times 720} = \frac{1}{3600}.$$



This is less than our tolerance of 1/3000, so it’s all good. We can safely 
say that the integral is approximately equal to 35/72, with an error less than 
1/3000. (We can even tell that 35/72 is an underestimate. Why?) By the way, 
I tried the integral on a computer program that can handle such things and 
it told me that the value of the integral is approximately 0.486385, whereas 
my calculator says that 35/72 equals 0.486111 (to six decimal places); these 
two numbers are indeed within 1/3000 of each other. 
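
The "keep adding terms until the next one is smaller than the tolerance" idea is easy to automate. Here is a minimal Python sketch for this particular series; the function names `term` and `estimate_within` are made up for illustration, and the stopping rule is only justified because the series is alternating with terms decreasing to 0.

```python
from fractions import Fraction
from math import factorial

def term(k):
    """k-th term (k = 0, 1, 2, ...) of the series for f(1): (-1)^k / ((2k+1) * (2k+2)!)."""
    return Fraction((-1) ** k, (2 * k + 1) * factorial(2 * k + 2))

def estimate_within(tolerance):
    """Add terms until the first omitted term is smaller than the tolerance."""
    total, k = Fraction(0), 0
    while abs(term(k)) >= tolerance:
        total += term(k)
        k += 1
    return total, k, abs(term(k))   # estimate, number of terms used, error bound

total, terms_used, bound = estimate_within(Fraction(1, 3000))
print(total, terms_used, bound, float(total))
```

With a tolerance of 1/3000 this stops after two terms, matching the hand calculation above; try a smaller tolerance to see how quickly the factorials make the terms shrink.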

Now as an exercise, you should try approximating 



dt 


within a tolerance of 1/1000, using the same method as above. (You’ll need
the Maclaurin series for $\sin(t)$, which you can find in Section 26.2 of the next
chapter.)










CHAPTER 26 


Taylor and Power Series: How to Solve Problems 


In this chapter, we’ll look at how to solve four different classes of problems 
involving Taylor series, Taylor polynomials and power series: 

• how to find where power series converge or diverge; 

• how to manipulate Taylor series to get other Taylor series or Taylor 
polynomials; 

• using Taylor series or Taylor polynomials to find derivatives; and 

• using Maclaurin series to find limits. 

26.1 Convergence of Power Series

Let’s say we have a power series about $x = a$:

$$\sum_{n=0}^{\infty} a_n(x - a)^n.$$

As we saw in the case of geometric series, a power series might converge for 
some x and diverge for other x. The question that we want to ask is this: 
given our power series, for which x does it converge, and for which x does 
it diverge? Furthermore, if the series converges for a specific $x$, it would be
nice to know whether the convergence is absolute or merely conditional. So, 
let’s see what could possibly happen, and then we’ll take advantage of these 
observations. 

26.1.1 Radius of convergence 

We want to find out for which $x$ the power series $\sum_{n=0}^{\infty} a_n(x - a)^n$ converges.
On the face of it, it seems like we have to answer infinitely many questions
here, since there are infinitely many values of $x$ to substitute in and test to see
whether the series converges or not. Let’s draw a number line representing
different values of $x$. For each $x$ such that our power series converges, we’ll
put a check mark above it, whereas if the power series diverges for a particular
$x$, we’ll put a cross instead. (Of course, we won’t do this for every single $x$,
since the diagram would get crowded! We’ll just do enough to get the idea.)
For example, the geometric series $\sum_{n=0}^{\infty} x^n$ converges when $-1 < x < 1$ and
diverges otherwise, so its picture looks like this:

[Number line: check marks for $-1 < x < 1$; crosses everywhere else, including at the points $-1$ and $1$.]


Note that I took special care to indicate the divergence at the endpoints 1 
and —1. 

On the other hand, we’ve seen that the series

$$\sum_{n=0}^{\infty} \frac{x^n}{n!}$$

converges for all $x$ (to $e^x$, of course), so its picture looks like this:

[Number line centered at 0: check marks everywhere.]

It seems like this could be pretty unpredictable. One thing that we can say
for sure is that the power series always converges at $x = a$. In fact, if you
substitute $x = a$ into

$$\sum_{n=0}^{\infty} a_n(x - a)^n = a_0 + a_1(x - a) + a_2(x - a)^2 + \cdots,$$

you can see that all the terms vanish except $a_0$. So, the series evidently
converges (to $a_0$). Unfortunately, the value $x = a$ is the only value for which
we can predict the convergence for certain. How about the other values?
Maybe it would be possible to get a hodgepodge of checks and crosses, like
this:

[Number line centered at $a$: an irregular mixture of check marks and crosses.]


It turns out that the above picture can’t happen for power series. Specifically,
there are only three possibilities that can occur:

1. There is some number $R > 0$, called the radius of convergence of the
power series, such that the picture looks like this:

[Number line: crosses to the left of $a - R$, a question mark at $a - R$, check marks between $a - R$ and $a + R$, a question mark at $a + R$, and crosses to the right of $a + R$.]

The explanation of this diagram is as follows:

• The power series converges absolutely in the region $|x - a| < R$ (you
can write this condition as $a - R < x < a + R$ if you prefer), so there
are check marks there.

• The power series diverges in the region $|x - a| > R$ (you can write
this as $x < a - R$ or $x > a + R$), so there are crosses there.

• At the two specific points where $|x - a| = R$ (that is, at $x = a + R$
and $x = a - R$), the power series might converge absolutely, converge
conditionally, or diverge. You have to check both these points
separately to see what happens there, so there are question marks
at these two points in the above diagram. I’ll refer to these points
as the “endpoints.”

An example of this is the geometric series $\sum_{n=0}^{\infty} x^n$. This is a power
series with $a = 0$ which converges absolutely when $|x| < 1$ and diverges
otherwise. The radius of convergence is therefore equal to 1, and the
series diverges at the endpoints 1 and $-1$.

2. The power series might converge absolutely for all $x$, in which case the
diagram looks like this:

[Number line centered at $a$: check marks everywhere.]

In this case, we say that the radius of convergence is $\infty$. As we saw
above, an example of this is the power series for $e^x$,

$$\sum_{n=0}^{\infty} \frac{x^n}{n!}.$$

Other examples include the Maclaurin series for $\sin(x)$ and $\cos(x)$.

3. The power series might converge absolutely only for $x = a$ and diverge
for all other $x$. In this case, the radius of convergence is 0. We’ll soon
see that this is the case for the series

$$\sum_{n=0}^{\infty} n!\,x^n,$$

for example. The picture for this case looks like this:

[Number line: a single check mark at $x = a$; crosses everywhere else.]


Of course, I haven’t said why these are the only possibilities. This should
become clear very soon!

How to find the radius and region of convergence

Given a power series, how do we find the radius of convergence? The answer
is to use the ratio test. Sometimes the root test will be more effective, but
for most problems the ratio test is better. (See Sections 23.3 and 23.4 in
Chapter 23 for more about the ratio and root tests, respectively.) Here’s the
general approach:

1. Write down the limiting absolute ratio; this should always look like

$$\lim_{n \to \infty} \left|\frac{a_{n+1}(x - a)^{n+1}}{a_n(x - a)^n}\right| = \lim_{n \to \infty} \left|\frac{a_{n+1}}{a_n}\right| |x - a|.$$

If instead you use the root test, you should get

$$\lim_{n \to \infty} |a_n(x - a)^n|^{1/n} = \lim_{n \to \infty} |a_n|^{1/n}\,|x - a|.$$

2. Work out the limit. It’s important to note that the limit is as $n \to \infty$,
not $x \to \infty$. There’s a big difference! Regardless of whether you use the
ratio test or the root test, the answer should be of the form $L|x - a|$,
where $L$ might be a finite number, 0, or even $\infty$. The important point
is that there is a factor of $|x - a|$ present.

3. In either the ratio test or the root test, the important thing is whether
the limit $L|x - a|$ is less than 1, greater than 1, or equal to 1. So, if $L$
is positive, then divide by $L$ to understand everything: if $|x - a| < 1/L$,
the power series converges absolutely; if $|x - a| > 1/L$, then the power
series diverges; whereas if $|x - a| = 1/L$, then we can’t tell and need to
check the two endpoints. We are in the first situation from the previous
section, and the radius of convergence is $1/L$.

4. If $L = 0$, then the limiting ratio is always 0 regardless of the value of
$x$. Since $0 < 1$, this means that the power series converges absolutely
for all $x$, so we are in the second case from the previous section and the
radius of convergence is $\infty$.

5. If $L = \infty$, then it looks like the power series never converges. In fact,
the series must converge when $x = a$, but it will diverge for every other
$x$ and so we are in the third case from the previous section: the radius
of convergence is 0.
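
To get a feel for the number $1/L$ before working out a limit by hand, you can look at the ratio $|a_n/a_{n+1}|$ for one large value of $n$. The Python sketch below does just that. It is only a numerical hint, under the assumption that the limiting ratio exists; the helper name `radius_estimate` and the choice $n = 200$ are arbitrary, and the result says nothing about the endpoints.

```python
from fractions import Fraction
from math import factorial, log, sqrt

def radius_estimate(coeff, n=200):
    """Heuristic guess at the radius of convergence 1/L, using |a_n / a_(n+1)| at one large n."""
    return abs(coeff(n) / coeff(n + 1))

# a_n = 1/(n ln(n)): the ratio is close to 1, suggesting radius 1
r_log = radius_estimate(lambda n: 1 / (n * log(n)))

# a_n = (-2)^n / sqrt(n): the ratio is close to 1/2, suggesting radius 1/2
r_two = radius_estimate(lambda n: (-2.0) ** n / sqrt(n))

# a_n = n!: the ratio is 1/(n + 1), which heads to 0, suggesting radius 0
r_fact = radius_estimate(lambda n: Fraction(factorial(n)))

print(r_log, r_two, float(r_fact))
```

A tiny value that keeps shrinking as you increase `n` points to radius 0, while a value that keeps growing points to radius $\infty$; either way, you still need the limit calculation to be sure.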



This more or less shows why we must get one of the three cases of the previous 
section. It’s still pretty abstract, though — we need to illustrate this with a 
whole bunch of examples. 

First, consider the power series

$$\sum_{n=2}^{\infty} \frac{x^n}{n \ln(n)}.$$

Let’s use the ratio test. We start off by taking the standard term $x^n/(n \ln(n))$
and putting it in the denominator of a big fraction; then to get the numerator
of our fraction, start with the standard term $x^n/(n \ln(n))$ again, but this time
replace each occurrence of $n$ by $(n + 1)$. Finally, take absolute values, then
limits as $n \to \infty$. So we are looking for

$$\lim_{n \to \infty} \left|\frac{\dfrac{x^{n+1}}{(n + 1)\ln(n + 1)}}{\dfrac{x^n}{n \ln(n)}}\right|.$$

This can be dealt with in the same way as ratio test problems for plain vanilla
series: just group together like terms. You get

$$\lim_{n \to \infty} \left|\frac{\dfrac{x^{n+1}}{(n + 1)\ln(n + 1)}}{\dfrac{x^n}{n \ln(n)}}\right| = \lim_{n \to \infty} \left|\frac{x^{n+1}}{x^n} \cdot \frac{n}{n + 1} \cdot \frac{\ln(n)}{\ln(n + 1)}\right| = \lim_{n \to \infty} |x| \cdot \frac{n}{n + 1} \cdot \frac{\ln(n)}{\ln(n + 1)} = |x|.$$

Again, the limit is as $n \to \infty$, which is why we replaced $n/(n + 1)$ and
$\ln(n)/\ln(n + 1)$ by 1. (Use l’Hôpital’s Rule to deal with the logarithms; I’ll
leave the details to you.) Anyway, the limiting ratio is $|x|$, so by the ratio
test, our power series converges absolutely when $|x| < 1$ and diverges when
$|x| > 1$. That is, the radius of convergence is 1. We still have to check what
happens when $x = 1$ and $x = -1$. Let’s do $x = 1$ first. Substituting $x = 1$,
the original power series becomes

$$\sum_{n=2}^{\infty} \frac{1}{n \ln(n)}.$$

Does this converge? I leave it to you to use the integral test to see that it
diverges (or see Section 23.5 in Chapter 23). Now let’s put $x = -1$ in the
original power series above to get

$$\sum_{n=2}^{\infty} \frac{(-1)^n}{n \ln(n)}.$$

This doesn’t converge absolutely; in fact, the series obtained by replacing
the terms by their absolute values is exactly the series when $x = 1$, which we
just saw diverges. On the other hand, the above series for $x = -1$ converges
by the alternating series test (you supply the details, using the methods of
Section 23.7 of Chapter 23). So, we have conditional convergence at the
point $x = -1$. Summarizing, the power series converges absolutely when
$-1 < x < 1$, converges conditionally when $x = -1$, and diverges for all other
$x$. The picture looks like this:


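[Number line: crosses to the left of $-1$, a check mark at $-1$, check marks between $-1$ and $1$, a cross at $1$, and crosses to the right of $1$.]

Now consider

$$\sum_{n=2}^{\infty} \frac{x^n}{n(\ln(n))^2}.$$

This is almost the same question as the previous one, but let’s see what
happens. We have

$$\lim_{n \to \infty} \left|\frac{\dfrac{x^{n+1}}{(n + 1)(\ln(n + 1))^2}}{\dfrac{x^n}{n(\ln(n))^2}}\right| = \lim_{n \to \infty} \left|\frac{x^{n+1}}{x^n} \cdot \frac{n}{n + 1} \cdot \frac{(\ln(n))^2}{(\ln(n + 1))^2}\right| = \lim_{n \to \infty} |x| \cdot \frac{n}{n + 1} \cdot \left(\frac{\ln(n)}{\ln(n + 1)}\right)^2,$$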

which again simplifies to $|x|$. So once again the power series converges
absolutely when $|x| < 1$ and diverges when $|x| > 1$. The radius of convergence is
therefore 1. As for the endpoints, let’s put in $x = 1$:

$$\sum_{n=2}^{\infty} \frac{1^n}{n(\ln(n))^2} = \sum_{n=2}^{\infty} \frac{1}{n(\ln(n))^2}.$$

As we’ve seen in Section 23.5 of Chapter 23, you can use the integral test to
see that this converges; since all the terms are positive, the convergence is
absolute. Now, when $x = -1$, we get

$$\sum_{n=2}^{\infty} \frac{(-1)^n}{n(\ln(n))^2}.$$

The series of absolute values of these terms is

$$\sum_{n=2}^{\infty} \frac{1}{n(\ln(n))^2},$$

which is the same as the series when $x = 1$, so it converges absolutely. We
conclude that the power series converges absolutely when $-1 \le x \le 1$ and
diverges for all other $x$, giving the following picture:

[Number line: crosses to the left of $-1$, check marks from $-1$ to $1$ including both endpoints, and crosses to the right of $1$.]

So, it’s the same as the previous example, except for different behavior at the
endpoints 1 and $-1$.

How about

$$\sum_{n=0}^{\infty} n!\,x^n?$$

We have

$$\lim_{n \to \infty} \left|\frac{(n + 1)!\,x^{n+1}}{n!\,x^n}\right| = \lim_{n \to \infty} (n + 1)|x|.$$

What is this last limit? Well, if $x = 0$, then this is just the limit as $n \to \infty$
of $0(n + 1) = 0$, which is of course 0. (You may notice that the quantity
$x^{n+1}/x^n$ isn’t well-defined in this case, though!) However, for any other value
of $x$, we’re screwed: the limit is $\infty$, which is certainly bigger than 1. We
conclude that the series only converges when $x = 0$ (remember, it has to
converge at $x = a$, which is 0 in this case). So, the radius of convergence is 0
and the picture looks like this:

[Number line: a single check mark at $x = 0$; crosses everywhere else.]

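Now consider

$$\sum_{n=1}^{\infty} \frac{(-2)^n}{\sqrt{n}}(x - 7)^n.$$

This is a power series with $a = 7$, so that point must be at the center of the
region of convergence. In any case, check that we have

$$\lim_{n \to \infty} \left|\frac{\dfrac{(-2)^{n+1}(x - 7)^{n+1}}{\sqrt{n + 1}}}{\dfrac{(-2)^n(x - 7)^n}{\sqrt{n}}}\right| = \lim_{n \to \infty} \left|\frac{(-2)^{n+1}}{(-2)^n} \cdot \frac{(x - 7)^{n+1}}{(x - 7)^n} \cdot \sqrt{\frac{n}{n + 1}}\right| = 2|x - 7|.$$

So the power series converges absolutely when $2|x - 7| < 1$ and diverges when
$2|x - 7| > 1$. Dividing through by 2, we see that it converges when $|x - 7| < \frac{1}{2}$
and diverges when $|x - 7| > \frac{1}{2}$. The radius of convergence is therefore $\frac{1}{2}$, so
our picture looks like this so far:

[Number line: crosses to the left of $6\frac{1}{2}$, a question mark at $6\frac{1}{2}$, check marks between $6\frac{1}{2}$ and $7\frac{1}{2}$, a question mark at $7\frac{1}{2}$, and crosses to the right of $7\frac{1}{2}$.]

We still have to check the endpoints. Let’s try $x = 7\frac{1}{2}$. Then the series is

$$\sum_{n=1}^{\infty} \frac{(-2)^n}{\sqrt{n}}\left(7\tfrac{1}{2} - 7\right)^n = \sum_{n=1}^{\infty} \frac{(-2)^n}{\sqrt{n}} \cdot \frac{1}{2^n} = \sum_{n=1}^{\infty} \frac{(-1)^n}{\sqrt{n}}.$$

Make sure you realize why $(-2)^n/2^n$ can be simplified to $(-1)^n$. Anyway, I
leave it to you to show that this last series doesn’t converge absolutely (use
the p-test) but that it does converge conditionally (use the alternating series
test). Now, when $x = 6\frac{1}{2}$, we get

$$\sum_{n=1}^{\infty} \frac{(-2)^n}{\sqrt{n}}\left(-\frac{1}{2}\right)^n = \sum_{n=1}^{\infty} \frac{(-2)^n}{\sqrt{n}} \cdot \frac{1}{(-2)^n} = \sum_{n=1}^{\infty} \frac{1}{\sqrt{n}},$$

which diverges. So, we conclude that the power series converges absolutely
when $6\frac{1}{2} < x < 7\frac{1}{2}$ and conditionally when $x = 7\frac{1}{2}$, and diverges otherwise.
The full picture is as follows:

[Number line: crosses to the left of $6\frac{1}{2}$, a cross at $6\frac{1}{2}$, check marks between $6\frac{1}{2}$ and $7\frac{1}{2}$, a check mark at $7\frac{1}{2}$, and crosses to the right of $7\frac{1}{2}$.]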

Now, this is a good candidate for the root test because of the complicated 
factor 2 n . You can work it out with the ratio test, but the root test is better. 
Consider the limit of the nth root of the absolute value of the nth term: 


M\^ x+2 ^ n \' = + 2 i") 1/n = ^ ^\ x+ 2 i- 

Now, regardless of the value of $x$, this limit is equal to 0, which is less than 1;
by the root test, the power series converges absolutely for all $x$. That is, the
radius of convergence is $\infty$ and the picture looks like this:

[Number line centered at $-2$: check marks everywhere.]


Just one more comment, before we move on to the next section: note 
that when the radius of convergence is positive, you might get convergence at 
both endpoints, at neither endpoint, at the left endpoint only, or at the right 
endpoint only. We’ve seen examples of all four possibilities above. 


26.2 Getting New Taylor Series from Old Ones

Let’s look at some techniques for finding Taylor series. One way to find the
Taylor series about $x = a$ of a given function $f$ is to use the formula directly,
as we did in Section 25.2 of the previous chapter. To use the formula, you
have to find all the derivatives of $f$, at least at $x = a$. For most functions,
this is a pain. Often a better idea is to use some common Taylor series to
synthesize new ones. Of course, you have to know some Taylor series first! It
is really useful to have the following five Maclaurin series (Taylor series about
$x = 0$) at your fingertips:


















26.2.1 Substitution and Taylor series

The most useful technique is substitution. In a Maclaurin series, you can
replace $x$ by a multiple of $x^n$, where $n$ is an integer, to get a new Maclaurin
series. For example, we know that

$$e^x = 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \frac{x^4}{4!} + \cdots$$

for any $x$; so if you want to find the Maclaurin series for $f(x) = e^{x^2}$, simply
replace $x$ by $x^2$ in the above series to get

$$e^{x^2} = 1 + x^2 + \frac{(x^2)^2}{2!} + \frac{(x^2)^3}{3!} + \frac{(x^2)^4}{4!} + \cdots,$$

which you can simplify down to

$$e^{x^2} = 1 + x^2 + \frac{x^4}{2!} + \frac{x^6}{3!} + \frac{x^8}{4!} + \cdots.$$

Since the original series holds for any $x$, so does this one.

Let’s look at another common example: what is the Maclaurin series for
$f(x) = 1/(1 + x^2)$? To do this, start with the geometric series

$$\frac{1}{1 - x} = \sum_{n=0}^{\infty} x^n = 1 + x + x^2 + x^3 + \cdots,$$

which is valid for $-1 < x < 1$; then replace $x$ by $-x^2$ to get

$$\frac{1}{1 + x^2} = \sum_{n=0}^{\infty} (-x^2)^n = \sum_{n=0}^{\infty} (-1)^n x^{2n} = 1 - x^2 + x^4 - x^6 + \cdots,$$

which is valid for $-1 < -x^2 < 1$. Notice that we also replaced $x$ by $-x^2$ in
this “valid for” inequality! This isn’t important here, since the inequalities
reduce to $-1 < x < 1$ anyway; but suppose instead we wanted to work out
the Maclaurin series for $1/(1 + 2x^2)$. Then we would have replaced $x$ by $-2x^2$
instead. This gives

$$\frac{1}{1 + 2x^2} = \sum_{n=0}^{\infty} (-2x^2)^n = \sum_{n=0}^{\infty} (-1)^n 2^n x^{2n} = 1 - 2x^2 + 4x^4 - 8x^6 + \cdots,$$

but this is valid only for $-1 < -2x^2 < 1$. Convince yourself that this
inequality reduces to $-1/\sqrt{2} < x < 1/\sqrt{2}$. (By the way, all the series in these
examples are geometric series.)
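
To see why the “valid for” inequality matters, it helps to compare partial sums with the actual function on both sides of $1/\sqrt{2} \approx 0.707$. The following Python sketch is just an illustration; the helper `partial_sum` and the choice of 60 terms are arbitrary.

```python
def partial_sum(x, terms=60):
    """Sum of the first `terms` terms of 1 - 2x^2 + 4x^4 - 8x^6 + ..."""
    return sum((-2 * x**2) ** n for n in range(terms))

# Inside the interval of validity: |x| < 1/sqrt(2)
inside = partial_sum(0.5)
exact_inside = 1 / (1 + 2 * 0.5**2)

# Outside the interval: the terms grow, so the partial sums are useless
outside = partial_sum(0.8)
exact_outside = 1 / (1 + 2 * 0.8**2)

print(inside, exact_inside)
print(outside, exact_outside)
```

At $x = 0.5$ the partial sum agrees with $1/(1 + 2x^2)$ to many decimal places, whereas at $x = 0.8$ it is wildly off, even though the function itself is perfectly well behaved there.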

Now, suppose you start with the following equation, which is true for all
real $x$:

$$\sin(x) = x - \frac{x^3}{3!} + \frac{x^5}{5!} - \frac{x^7}{7!} + \cdots.$$

The right-hand side is the Maclaurin series, or Taylor series about $x = 0$, of
$\sin(x)$. If you replace $x$ by $(x - 18)$, you get a Taylor series about $x = 18$
instead:

$$\sin(x - 18) = (x - 18) - \frac{(x - 18)^3}{3!} + \frac{(x - 18)^5}{5!} - \frac{(x - 18)^7}{7!} + \cdots.$$

The right-hand side is not the Taylor series about $x = 18$ of $\sin(x)$, because
the left-hand side is no longer $\sin(x)$; it’s $\sin(x - 18)$. So our substitution has
translated the original function as well. We have actually found the Taylor
series about $x = 18$ of $\sin(x - 18)$. To find the Taylor series of $\sin(x)$ about
$x = 18$, you have to use the formula in Taylor’s Theorem. (We looked at a
similar problem at the end of Section 25.2 in the previous chapter.)

The moral of this last example is that if you replace $x$ by $(x - a)$, then you
get a Taylor series about $x = a$ instead of a Maclaurin series, but the function
is different. This can still be useful. For example, to find the Taylor series of
$\ln(x)$ about $x = 1$, start with one of the formulas from the previous section:

$$\ln(1 + x) = \sum_{n=1}^{\infty} \frac{(-1)^{n+1} x^n}{n} = x - \frac{x^2}{2} + \frac{x^3}{3} - \frac{x^4}{4} + \cdots \quad \text{for } -1 < x < 1.$$

Now, let’s replace $x$ by $(x - 1)$. The quantity $\ln(1 + x)$ becomes $\ln(1 + (x - 1))$,
or just $\ln(x)$; so we get

$$\ln(x) = (x - 1) - \frac{(x - 1)^2}{2} + \frac{(x - 1)^3}{3} - \frac{(x - 1)^4}{4} + \cdots \quad \text{for } -1 < (x - 1) < 1.$$

Notice that I also replaced $x$ by $(x - 1)$ in the original inequality $-1 < x < 1$,
arriving at $-1 < (x - 1) < 1$. This looks a bit silly, so add 1 everywhere to
get $0 < x < 2$. We end up with

$$\ln(x) = \sum_{n=1}^{\infty} \frac{(-1)^{n+1}(x - 1)^n}{n} = (x - 1) - \frac{(x - 1)^2}{2} + \frac{(x - 1)^3}{3} - \frac{(x - 1)^4}{4} + \cdots \quad \text{for } 0 < x < 2.$$

We have used the Maclaurin series of $\ln(1 + x)$ to get the Taylor series about
$x = 1$ of $\ln(x)$.

By the way, the substitution technique can also be used to find Taylor
polynomials, but you have to be careful to get the order right. For example,
if you take $f(x) = e^x$ and $a = 0$, the Taylor polynomial of order 3 is

$$P_3(x) = 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!}.$$

Now if $g(x) = e^{x^2}$, it’s a mistake to replace $x$ by $x^2$ in the above polynomial
and claim the third-order Taylor polynomial of $g$ is

$$P_3(x) = 1 + x^2 + \frac{x^4}{2!} + \frac{x^6}{3!}.$$

This is actually the sixth-order Taylor polynomial of $g$ about 0, so the
left-hand side should say $P_6(x)$ instead of $P_3(x)$. To get the correct formula for
$P_3(x)$, just drop all the terms of degree greater than 3. This means that
$P_3(x) = 1 + x^2$. Of course, this is also $P_2(x)$ as well! Be careful with your
degrees. That’s an order. (At least, if you want to pass calculus and get your
degree ... ouch. OK, no more puns, I promise.)

26.2.2 Differentiating Taylor series

If a power series converges to a differentiable function $f$ on an open interval
$(a, b)$, then it turns out that you can differentiate the series term-by-term
to get a new series which converges to $f'(x)$ on the same interval. The
situation at the endpoints $a$ and $b$ is a little trickier: the differentiated series
might diverge even if the original series converges.* So check the endpoints
separately.

Our first example is to find the Maclaurin series for $\sin(x)$, assuming that
we know the Maclaurin series for $\cos(x)$ is given by

$$\cos(x) = 1 - \frac{x^2}{2!} + \frac{x^4}{4!} - \frac{x^6}{6!} + \frac{x^8}{8!} - \cdots;$$

the formula is valid for all $x$. (We proved this in Section 24.2.3 of Chapter 24.)
If you differentiate both sides, term-by-term on the right, you get

$$-\sin(x) = -\frac{2x}{2!} + \frac{4x^3}{4!} - \frac{6x^5}{6!} + \frac{8x^7}{8!} - \cdots.$$

We need to multiply both sides by $-1$ to get rid of the minus sign on the
left-hand side, but there’s another simplification to be made. We have to
deal with quantities like 2/2!, 4/4!, 6/6! and 8/8!. Consider 4/4! for a second.
Since 4! is actually 3! × 4, you can reduce 4/4! to 1/3! by canceling out a factor
of 4. Similarly, 6! = 5! × 6, so we have 6/6! = 1/5!, and also 8! = 7! × 8, so
8/8! = 1/7!. Altogether, the above equation becomes

$$\sin(x) = x - \frac{x^3}{3!} + \frac{x^5}{5!} - \frac{x^7}{7!} + \cdots.$$

Since the series for $\cos(x)$ is valid for all $x$, so is the differentiated series above.
That is, the Maclaurin series for $\sin(x)$ is given by the above equation, which
is valid for all $x$. This proves formula #2 in Section 26.2 above.

Here’s another example of differentiating a power series. Suppose you want
to find the Maclaurin series for $f(x) = 1/(1 + x)^2$. The best way would be
to start with the series for $1/(1 + x)$, which is obtained from the standard
geometric series (#4 above) by replacing $x$ by $-x$:

$$\frac{1}{1 + x} = 1 - x + x^2 - x^3 + x^4 - \cdots;$$

this is valid for $-1 < x < 1$. Then differentiate both sides, term-by-term on
the right-hand side, to get

$$-\frac{1}{(1 + x)^2} = 0 - 1 + 2x - 3x^2 + 4x^3 - \cdots.$$

All that’s left is to take negatives of both sides to get

$$\frac{1}{(1 + x)^2} = 1 - 2x + 3x^2 - 4x^3 + \cdots = \sum_{n=0}^{\infty} (-1)^n (n + 1) x^n.$$

This is valid for $-1 < x < 1$. (You should check that the expression in sigma
notation is correct, and that the series doesn’t converge at the endpoints
$x = \pm 1$.)

*By the way, if the differentiated series converges at one (or both) of the endpoints,
then the original series converges there too.

You can apply these ideas to Taylor polynomials as well; you just
have to be careful with orders. Since differentiating a polynomial
knocks the degree down by one, the differentiated Taylor polynomial is order
one less than the original polynomial. For example, the third-order Taylor
polynomial about 0 of $1/(1 + x)$ is $1 - x + x^2 - x^3$, as you can see from the
previous example; if you differentiate and multiply by $-1$, you see that the
second-order Taylor polynomial about 0 of $1/(1 + x)^2$ is $1 - 2x + 3x^2$.
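The term-by-term differentiation above is easy to check mechanically. Here is a minimal Python sketch, assuming a truncated power series is stored as a list whose entry n is the coefficient of x^n; the helper name `differentiate` is made up for this illustration.

```python
from fractions import Fraction

def differentiate(coeffs):
    """Differentiate a truncated power series term by term.

    coeffs[n] is the coefficient of x**n. The result has one fewer term,
    which mirrors the order dropping by one.
    """
    return [n * c for n, c in enumerate(coeffs)][1:]

# Coefficients of 1/(1 + x) = 1 - x + x^2 - x^3 + ... up to order 6.
geometric = [Fraction((-1) ** n) for n in range(7)]

# Differentiating gives -1/(1 + x)^2, so negate to get 1/(1 + x)^2.
result = [-c for c in differentiate(geometric)]

# Compare with the sigma-notation coefficients (-1)^n (n + 1).
expected = [Fraction((-1) ** n * (n + 1)) for n in range(6)]
print(result == expected)
```

Note that seven coefficients go in and only six come out, which is the "order one less" effect described above.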


26.2.3 Integrating Taylor series 



You can also integrate a power series term-by-term. The new series converges 
in the same interval as the old one (except perhaps at the endpoints of the 
interval of convergence). If you use an indefinite integral, don’t forget the 
constant! Let’s see a few examples. First, let’s try to prove the following 
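formula for $\ln(1-x)$, which we first stated as part of formula #5 in Section 26.2
above but never proved:

$$\ln(1-x) = -\sum_{n=1}^{\infty} \frac{x^n}{n} = -x - \frac{x^2}{2} - \frac{x^3}{3} - \frac{x^4}{4} - \cdots$$

for $-1 < x < 1$. To do it, we’ll use the geometric series formula, which is #4
in Section 26.2:

$$\frac{1}{1-x} = \sum_{n=0}^{\infty} x^n = 1 + x + x^2 + x^3 + \cdots,$$

valid for $-1 < x < 1$. Then integrate everything with respect to $x$:

$$\int \frac{1}{1-x}\,dx = \int \sum_{n=0}^{\infty} x^n\,dx = \int (1 + x + x^2 + x^3 + \cdots)\,dx.$$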


(Note that I have used both sigma notation and expanded notation here, but 
you would normally only use one of the two.) Now integrate term-by-term: 


$$-\ln(1-x) = C + \sum_{n=0}^{\infty} \frac{x^{n+1}}{n+1} = C + x + \frac{x^2}{2} + \frac{x^3}{3} + \frac{x^4}{4} + \cdots.$$

It’s a good idea to put the constant first instead of as $+C$ at the end, since
it’s really the zeroth-degree term in the power series. Now we have to find out
what $C$ actually is. The best way is to substitute $x = 0$. In this case, we get

$$-\ln(1-0) = C + 0 + \frac{0^2}{2} + \frac{0^3}{3} + \frac{0^4}{4} + \cdots,$$

which reduces to $C = 0$. Substituting in and taking negatives of both sides,
we get our series for $\ln(1-x)$ as before:

$$\ln(1-x) = -\sum_{n=1}^{\infty} \frac{x^n}{n} = -x - \frac{x^2}{2} - \frac{x^3}{3} - \frac{x^4}{4} - \cdots.$$

Since the original series (for $1/(1-x)$) converges for $-1 < x < 1$, so does the
integrated series (for $-\ln(1-x)$, hence also $\ln(1-x)$). Actually, the series
for $\ln(1-x)$ does also converge when $x = -1$; however, as I said, integrating
power series term by term doesn’t give any information about the endpoints
of the interval of convergence. By the way, now you can replace $x$ by $-x$ to
get the expansion of $\ln(1+x)$ from formula #5 of Section 26.2 above.
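If you want to see the integration step in action, here is a rough Python sketch under the same assumption as before (a truncated series stored as a list of coefficients). The helper `integrate` and the choice of 40 terms are illustrative only.

```python
import math
from fractions import Fraction

def integrate(coeffs, constant=0):
    """Integrate a truncated power series term by term.

    coeffs[n] is the coefficient of x**n; `constant` is the C that goes in
    front as the zeroth-degree term.
    """
    return [Fraction(constant)] + [Fraction(c) / (n + 1) for n, c in enumerate(coeffs)]

# 1/(1 - x) = 1 + x + x^2 + ... ; keep 40 terms.
geometric = [Fraction(1)] * 40

# Integrating gives -ln(1 - x) with C = 0; negate to get ln(1 - x).
log_series = [-c for c in integrate(geometric)]

x = 0.3
approx = sum(float(c) * x ** n for n, c in enumerate(log_series))
print(approx, math.log(1 - x))
```

The two printed numbers should agree to many decimal places for any $x$ well inside $(-1, 1)$; near the endpoints you would need far more terms.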

Another example: how would you find the Maclaurin series for $\tan^{-1}(x)$?
This would be a real pain to differentiate over and over (just try it and see!),
but we can be really sneaky and integrate a series we already know. Let’s
see, $\tan^{-1}(x)$ is an antiderivative of $1/(1 + x^2)$, and we saw in Section 26.2.1
above that we have

$$\frac{1}{1+x^2} = 1 - x^2 + x^4 - x^6 + \cdots$$

when $-1 < x < 1$. We can now integrate both sides:

$$\int \frac{1}{1+x^2}\,dx = \int (1 - x^2 + x^4 - x^6 + \cdots)\,dx.$$

Integrating term-by-term on the right-hand side gives

$$\tan^{-1}(x) = C + x - \frac{x^3}{3} + \frac{x^5}{5} - \frac{x^7}{7} + \cdots.$$

Now we substitute $x = 0$ to find out what $C$ is:

$$\tan^{-1}(0) = C + 0 - \frac{0^3}{3} + \frac{0^5}{5} - \frac{0^7}{7} + \cdots,$$

which simplifies to $C = \tan^{-1}(0) = 0$. So, we have

$$\tan^{-1}(x) = x - \frac{x^3}{3} + \frac{x^5}{5} - \frac{x^7}{7} + \cdots = \sum_{n=0}^{\infty} \frac{(-1)^n x^{2n+1}}{2n+1}.$$

(Check that you believe the sigma-notation version on the right-hand side.)
Since the original series for $1/(1 + x^2)$ converges when $-1 < x < 1$, so does
the series for $\tan^{-1}(x)$.*
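A numerical spot check of the series just derived can be reassuring. The following Python sketch compares partial sums with the library arctangent; the function name `arctan_partial_sum` and the sample points are arbitrary choices for illustration.

```python
import math

def arctan_partial_sum(x, terms):
    """Sum the first `terms` terms of x - x^3/3 + x^5/5 - ..."""
    return sum((-1) ** n * x ** (2 * n + 1) / (2 * n + 1) for n in range(terms))

for x in (0.1, 0.5, 0.9):
    print(x, arctan_partial_sum(x, 200), math.atan(x))
```

The closer $x$ gets to 1, the more terms are needed for the same accuracy, which fits with 1 being an endpoint of the interval of convergence.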

Let’s look at an example of a definite integral. Suppose that a function $f$
is defined by

$$f(x) = \int_0^x \sin(t^3)\,dt.$$

What is its Maclaurin series? We should start by finding the series for $\sin(t^3)$.
To do this, substitute $x = t^3$ in the Maclaurin series for $\sin(x)$ to get

$$\sin(t^3) = t^3 - \frac{(t^3)^3}{3!} + \frac{(t^3)^5}{5!} - \frac{(t^3)^7}{7!} + \cdots = t^3 - \frac{t^9}{3!} + \frac{t^{15}}{5!} - \frac{t^{21}}{7!} + \cdots.$$

Since the series for $\sin(x)$ is valid for all real $x$, the series for $\sin(t^3)$ is valid
for all real $t$. Now, we can integrate both sides from 0 to $x$ to get

$$f(x) = \int_0^x \sin(t^3)\,dt = \int_0^x \left(t^3 - \frac{t^9}{3!} + \frac{t^{15}}{5!} - \frac{t^{21}}{7!} + \cdots\right) dt;$$

integrating the right-hand side term-by-term, we get

$$f(x) = \left[\frac{t^4}{4} - \frac{t^{10}}{10\cdot 3!} + \frac{t^{16}}{16\cdot 5!} - \frac{t^{22}}{22\cdot 7!} + \cdots\right]_0^x = \frac{x^4}{4} - \frac{x^{10}}{10\cdot 3!} + \frac{x^{16}}{16\cdot 5!} - \frac{x^{22}}{22\cdot 7!} + \cdots;$$

this is valid for all real $x$. (You should try to convert this series to sigma
notation. The answer is given in Section 26.3 below.)

*In fact, the series for $\tan^{-1}(x)$ also converges when $x = 1$ (or $x = -1$) by the alternating
series test, eventually leading to the cute formula

$$\frac{\pi}{4} = 1 - \frac{1}{3} + \frac{1}{5} - \frac{1}{7} + \cdots.$$

You can also apply the above integration techniques to Taylor polynomials; 
this time the order of the Taylor polynomial increases by 1. 



26.2.4 Adding and subtracting Taylor series




If you know the Taylor series about $x = a$ for two functions $f$ and $g$, then
the Taylor series for the sum $f(x) + g(x)$ is of course the sum of the two
respective Taylor series, at least in the overlap of the regions where the Taylor
series converge. The same goes for the difference $f(x) - g(x)$. The only thing
you need to do in practice is to group terms of the same degree together, and
worry about where the resulting series converges. For example, the Maclaurin
series for $\sin(x) - e^x$ is given by

$$\left(x - \frac{x^3}{3!} + \frac{x^5}{5!} - \frac{x^7}{7!} + \cdots\right) - \left(1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \frac{x^4}{4!} + \frac{x^5}{5!} + \frac{x^6}{6!} + \frac{x^7}{7!} + \cdots\right),$$

which should be simplified; after canceling, the series looks like

$$-1 - \frac{x^2}{2!} - \frac{2x^3}{3!} - \frac{x^4}{4!} - \frac{x^6}{6!} - \frac{2x^7}{7!} - \cdots,$$

at least up to terms of order 7. Since the series for $\sin(x)$ and $e^x$ are valid for
all $x$, so is the series for $\sin(x) - e^x$.

If you want to deal with Taylor polynomials, you have to be careful to 
take the order to be the lesser of the two orders. For example, we know that 
the third-order Taylor polynomial about 0 of $1/(1 - x)$ is

$$1 + x + x^2 + x^3,$$

while the fourth-order polynomial of $e^x$ about 0 is

$$1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \frac{x^4}{4!}.$$

If you set $f(x) = 1/(1-x) + e^x$ and look for its Taylor polynomial about 0, it’s
no good taking the sum of the above two polynomials. The problem is that
you have a fourth-order term in the polynomial for $e^x$, but no fourth-order
term for $1/(1 - x)$. It’s like comparing apples and oranges. You pretty much
have to ignore the $x^4/4!$ term in the polynomial for $e^x$ to get the third-order
Taylor polynomial

$$1 + x + \frac{x^2}{2!} + \frac{x^3}{3!}.$$

Now you can add $1 + x + x^2 + x^3$ to the above polynomial to see that the
third-order Taylor polynomial about $x = 0$ for $1/(1 - x) + e^x$ is

$$(1 + x + x^2 + x^3) + \left(1 + x + \frac{x^2}{2!} + \frac{x^3}{3!}\right),$$

which simplifies to

$$2 + 2x + \frac{3x^2}{2} + \frac{7x^3}{6}.$$
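The "lesser of the two orders" rule can be written down in a few lines. This is only a sketch, assuming Taylor polynomials are stored as coefficient lists; `add_truncated` is a hypothetical helper name.

```python
from fractions import Fraction
from math import factorial

def add_truncated(p, q):
    """Add two Taylor polynomials, keeping only the lesser of the two orders."""
    order = min(len(p), len(q))
    return [a + b for a, b in zip(p[:order], q[:order])]

geometric = [Fraction(1)] * 4                                # 1/(1-x), third order
exponential = [Fraction(1, factorial(n)) for n in range(5)]  # e^x, fourth order

print(add_truncated(geometric, exponential))
```

The fourth-order coefficient of $e^x$ is silently dropped, so the output has exactly four entries, matching the third-order polynomial found above.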


26.2.5 Multiplying Taylor series 



You can also multiply two Taylor series to get a new one which converges 
to the product of the two relevant functions, at least in the intersection of 
the regions where the Taylor series converge. Writing this in sigma notation 
can get pretty messy and usually involves double sums. Normally one is 
interested in the first few terms of a series. For example, let’s find the terms 
of the Maclaurin series up to and including third order for $f(x) = e^x \sin(x)$.
To do this, write out the series for $e^x$ and $\sin(x)$ up to third order, multiply
out, and ignore any terms greater than third order:

$$\begin{aligned}
e^x \sin(x) &= \left(1 + x + \frac{x^2}{2} + \frac{x^3}{6} + \cdots\right)\left(x - \frac{x^3}{6} + \cdots\right) \\
&= 1\left(x - \frac{x^3}{6}\right) + x(x) + \frac{x^2}{2}(x) + \cdots \\
&= x + x^2 + \frac{x^3}{3} + \cdots.
\end{aligned}$$

There’s a skill in ignoring terms you don’t need. For example, I left out the
product of the terms $x$ and $-x^3/6$ from the first and second sums, respectively;
this is because I realized that this would give a term in $x^4$, which I don’t care
about since we only need terms up to third order. If I wanted terms of up to
fourth order, then of course I would have had to worry about more terms.

Actually, it’s important not to pay attention to terms of order higher than
the ones you’ve actually written down for the original functions. For example,
take the second-order Taylor polynomial of $e^x$ about 0, which is

$$1 + x + \frac{x^2}{2};$$

now multiply it by the second-order Taylor polynomial of $e^{-x}$ about 0, which
is $1 - x + x^2/2$. You get

$$\left(1 + x + \frac{x^2}{2}\right)\left(1 - x + \frac{x^2}{2}\right),$$

which simplifies to

$$1 + \frac{x^4}{4}.$$

If you look at this and claim that this is the fourth-order Taylor polynomial
about 0 for the product $(e^x)(e^{-x})$, you’d be wrong! After all, the product is
just 1, so all of its Taylor polynomials are just 1. The correct thing to do is
to ignore all terms in the product of degree greater than 2. After all, we only
started with second-order polynomials, so why should we expect anything
higher when we multiply these polynomials together? In the above polynomial
$1 + x^4/4$, the term $x^4/4$ is of degree higher than 2, so it’s not accurate
and should be ignored. The second-order polynomial for the product is 1,
and that’s all you can tell from the product of the two second-order Taylor
polynomials we started with. Don’t bite off more degrees than you can chew!
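Here is a small Python sketch of multiplication with truncation, again assuming coefficient lists; the helper `multiply_truncated` is invented for this example. It throws away every product term whose degree exceeds the lesser of the two input orders, which is exactly the discipline described above.

```python
from fractions import Fraction
from math import factorial

def multiply_truncated(p, q):
    """Multiply two Taylor polynomials, discarding terms beyond the lesser order."""
    order = min(len(p), len(q)) - 1
    product = [Fraction(0)] * (order + 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            if i + j <= order:
                product[i + j] += a * b
    return product

exp_pos = [Fraction(1, factorial(n)) for n in range(3)]          # e^x, second order
exp_neg = [Fraction((-1) ** n, factorial(n)) for n in range(3)]  # e^(-x), second order
print(multiply_truncated(exp_pos, exp_neg))   # the x^4/4 term never appears

exp3 = [Fraction(1, factorial(n)) for n in range(4)]             # e^x, third order
sin3 = [Fraction(0), Fraction(1), Fraction(0), Fraction(-1, 6)]  # sin(x), third order
print(multiply_truncated(exp3, sin3))
```

The first product comes out as the constant 1 (coefficients 1, 0, 0), and the second reproduces $x + x^2 + x^3/3$.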


26.2.6 Dividing Taylor series

You can do exactly the same thing with quotients by using long division. The
trick is to ignore all but the terms of order up to the one you are interested
in. For example, to find the Maclaurin series for $f(x) = \sec(x)$ up to fourth
order, first write $\sec(x)$ as $1/\cos(x)$, then set up a long division just as you
do with polynomials. The main difference here is that you should write the
terms so that the degrees are increasing, instead of in the normal decreasing
manner. Since we’re interested in terms up to fourth order, we’ll use

$$\cos(x) = 1 - \frac{x^2}{2} + \frac{x^4}{24} - \cdots$$

in the long division for $1/\cos(x)$:

$$\begin{array}{rl}
 & 1 + \tfrac{1}{2}x^2 + \tfrac{5}{24}x^4 + \cdots \\ \cline{2-2}
1 + 0x - \tfrac{1}{2}x^2 + 0x^3 + \tfrac{1}{24}x^4 - \cdots \ \big) & 1 + 0x + 0x^2 + 0x^3 + 0x^4 + \cdots \\
 & 1 + 0x - \tfrac{1}{2}x^2 + 0x^3 + \tfrac{1}{24}x^4 + \cdots \\ \cline{2-2}
 & \tfrac{1}{2}x^2 + 0x^3 - \tfrac{1}{24}x^4 + \cdots \\
 & \tfrac{1}{2}x^2 + 0x^3 - \tfrac{1}{4}x^4 + \cdots \\ \cline{2-2}
 & \tfrac{5}{24}x^4 + \cdots
\end{array}$$

So the Maclaurin series for $\sec(x)$ is $1 + x^2/2 + 5x^4/24 + \cdots$, up to terms of
fourth order.

If instead we would like to find the Maclaurin series for $\tan(x)$ up to
fourth order, we could proceed similarly, since $\tan(x) = \sin(x)/\cos(x)$. Using
$\sin(x) = x - x^3/6 + \cdots$ and $\cos(x) = 1 - x^2/2 + x^4/24 - \cdots$, the division
would begin like this:

$$1 + 0x - \tfrac{1}{2}x^2 + 0x^3 + \tfrac{1}{24}x^4 - \cdots \ \big)\ \overline{\,0 + x + 0x^2 - \tfrac{1}{6}x^3 + 0x^4 + \cdots\,}$$

I leave it to you to do the calculation and see that $\tan(x) = x + x^3/3 + \cdots$ up
to terms of fourth order (note that the fourth order term here is actually 0).

So the moral of the story is that you may not have to differentiate over 
and over again and use the formula for Taylor series. If you’re lucky, you can 
instead use some of the five basic series, plus one or more of the techniques 
of substitution, differentiation, integration, addition, subtraction, multiplication,
and division.
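The long divisions for $\sec(x)$ and $\tan(x)$ above follow a mechanical pattern, so they can be sketched in Python. This is one possible way to organize it, assuming coefficient lists written in increasing powers; `divide_truncated` is a made-up name and the routine requires a nonzero constant term in the divisor.

```python
from fractions import Fraction

def divide_truncated(num, den):
    """Long-divide two series written in increasing powers of x.

    num[n] and den[n] are coefficients of x**n; den[0] must be nonzero.
    Returns the quotient up to the order of the shorter input.
    """
    order = min(len(num), len(den))
    remainder = [Fraction(c) for c in num[:order]]
    quotient = []
    for k in range(order):
        q = remainder[k] / den[0]
        quotient.append(q)
        # Subtract q * x^k * den from the remainder, ignoring higher orders.
        for j in range(order - k):
            remainder[k + j] -= q * den[j]
    return quotient

cos4 = [Fraction(1), 0, Fraction(-1, 2), 0, Fraction(1, 24)]
sin4 = [0, Fraction(1), 0, Fraction(-1, 6), 0]
one = [1, 0, 0, 0, 0]

print(divide_truncated(one, cos4))   # sec(x) up to fourth order
print(divide_truncated(sin4, cos4))  # tan(x) up to fourth order
```

The zero placeholders in the lists play the same role as the $0x$ and $0x^3$ terms written out in the hand calculation.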


26.3 Using Power and Taylor Series to Find Derivatives


Recall the formula for the $n$th coefficient of the Taylor series of $f(x)$ about
$x = a$:

$$a_n = \frac{f^{(n)}(a)}{n!}.$$

Let’s multiply through by $n!$ to arrive at the following formula:

$$\boxed{f^{(n)}(a) = n! \times a_n.}$$

In words, this means that

$$f^{(n)}(a) = n! \times \left(\text{the coefficient of } (x-a)^n \text{ in the Taylor series of } f(x) \text{ about } x = a\right).$$

So if you know the Taylor series of a function about some point $a$, you can
easily find the derivatives of that function at $a$. This is all you get! There’s
no information about the value of the derivatives at any other value of $x$; it’s
only $x = a$. (Actually, to find the $n$th derivative, you only need a Taylor
polynomial at $x = a$ of order $n$ or more, not the whole Taylor series.)

To use the above equation, you need to start by finding an appropriate
Taylor series for your function. The techniques from the previous few sections
can be really useful for this. For example, suppose that $f(x) = e^{x^2}$, and we
want to find $f^{(100)}(0)$ and $f^{(101)}(0)$. We kick off by finding the Maclaurin
series for $e^{x^2}$:

$$e^{x^2} = \sum_{n=0}^{\infty} \frac{x^{2n}}{n!} = 1 + x^2 + \frac{x^4}{2!} + \frac{x^6}{3!} + \cdots.$$

By the boxed formula above,

$$f^{(100)}(0) = 100! \times (\text{coefficient of } x^{100} \text{ in the above Maclaurin series}).$$

So what is the coefficient of $x^{100}$ in the Maclaurin series, anyway? You can
look at it and just see that it’s $1/(50!)$, or if you want to be more formal about
it, you can work out which value of $n$ will give you $x^{100}$. In particular, we
want to locate the term $x^{2n}/n!$ that is a multiple of $x^{100}$. This means that
$2n = 100$, so $n = 50$, and the term is $x^{100}/(50!)$. So the coefficient is $1/(50!)$.
This means that

$$f^{(100)}(0) = 100! \times \frac{1}{50!} = \frac{100!}{50!}.$$

(Don’t make the mistake of trying to simplify this last expression down to 2!;
factorials don’t work that way.) Now, how about finding $f^{(101)}(0)$? This is
equal to 101! times the coefficient of $x^{101}$ in the above series. What is that
coefficient? Hang on, there are no odd powers at all in the series! Put another
way, what value of $n$ would give you $x^{101}$? It would have to solve $2n = 101$,
but $n$ has to be an integer, so the power $x^{101}$ isn’t present. That means that
the coefficient of $x^{101}$ is 0, so

$$f^{(101)}(0) = 101! \times 0 = 0.$$
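The recipe "find the coefficient, then multiply by $n!$" is simple enough to code up. Below is a minimal Python sketch for the $e^{x^2}$ example; the function names are invented for illustration and exact arithmetic is used so that the huge factorials cause no rounding trouble.

```python
from math import factorial
from fractions import Fraction

def coefficient(power):
    """Coefficient of x**power in the series e^(x^2) = sum of x^(2n)/n!."""
    if power % 2 == 1:
        return Fraction(0)          # no odd powers appear
    return Fraction(1, factorial(power // 2))

def derivative_at_zero(order):
    """f^(order)(0) = order! times the coefficient of x**order."""
    return factorial(order) * coefficient(order)

print(derivative_at_zero(100) == Fraction(factorial(100), factorial(50)))
print(derivative_at_zero(101))
```

As a tiny hand-checkable case, `derivative_at_zero(4)` uses the coefficient $1/2!$ of $x^4$ and gives $4!/2! = 12$.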

All right, let’s see a more difficult example. In Section 26.2.3 above, we
found that the Maclaurin series of the function $f$, defined by

$$f(x) = \int_0^x \sin(t^3)\,dt,$$

is given by

$$\frac{x^4}{4} - \frac{x^{10}}{10\cdot 3!} + \frac{x^{16}}{16\cdot 5!} - \frac{x^{22}}{22\cdot 7!} + \cdots;$$

this series converges to $f(x)$ for all real $x$. I now ask you this: what is $f^{(50)}(0)$?
How about $f^{(52)}(0)$? To do this, we are going to need the coefficients of $x^{50}$
and $x^{52}$ in the above series for $f(x)$. Remember, $f^{(50)}(0)$ is 50! times the
coefficient of $x^{50}$ in the Maclaurin series of $f(x)$, and of course the same is
true for $f^{(52)}(0)$ except with 52 instead of 50 everywhere.

Now, to find the coefficients of $x^{50}$ and $x^{52}$ in the above series, you could
keep on writing it out until you got far enough. A better way is to change
the series to sigma notation. I challenged you to do this as an exercise earlier;
here’s how you do it, in any case. Note that the powers of $x$ go 4, 10, 16, 22,
and so on. This means that they go up by 6 every time, starting at 4. So,
the exponents are given by $6n + 4$, where $n$ runs through the numbers 0, 1,
2, 3, and so on. Now, let’s look at the denominator. It’s the product of the
quantity $6n + 4$ and a factorial of an odd number. The odd numbers go 1, 3,
5, 7, ..., so they are given by $2n + 1$. So, the denominator is $(6n + 4)(2n + 1)!$.
Finally, the terms alternate, beginning with positive sign, so there should be
a $(-1)^n$ in there as well. We have now seen that

$$f(x) = \sum_{n=0}^{\infty} \frac{(-1)^n x^{6n+4}}{(6n+4)(2n+1)!}.$$

Now we can finally find the coefficients of $x^{50}$ and $x^{52}$. For the first one, try
to solve $6n + 4 = 50$. This would give $n = 23/3$, which is not an integer, so
the coefficient of $x^{50}$ is 0. This means that

$$f^{(50)}(0) = 50! \times (\text{coefficient of } x^{50}) = 50! \times 0 = 0.$$

On the other hand, for $x^{52}$, try to solve $6n + 4 = 52$. This gives $n = 8$, so we
can get the coefficient of $x^{52}$ by looking at what happens when we put $n = 8$.
The term in the sum given by $n = 8$ is

$$\frac{(-1)^8 x^{6\times 8+4}}{(6\times 8+4)(2\times 8+1)!} = \frac{x^{52}}{52 \times 17!},$$

so the coefficient is $1/(52 \times 17!)$. Finally,

$$f^{(52)}(0) = 52! \times (\text{coefficient of } x^{52}) = 52! \times \frac{1}{52 \times 17!} = \frac{51!}{17!}.$$

Notice that I did a little canceling here: $52!/52 = 51!$. Convince yourself that
this is true before proceeding!
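The same bookkeeping (solve $6n + 4 = \text{power}$, check that $n$ is a whole number, then read off the coefficient) can be sketched in Python as follows. The function name is hypothetical, and `divmod` is used to test whether the power has the form $6n + 4$.

```python
from math import factorial
from fractions import Fraction

def coefficient(power):
    """Coefficient of x**power in sum (-1)^n x^(6n+4) / ((6n+4)(2n+1)!)."""
    n, remainder = divmod(power - 4, 6)
    if power < 4 or remainder != 0:
        return Fraction(0)          # this power never appears in the series
    return Fraction((-1) ** n, (6 * n + 4) * factorial(2 * n + 1))

for order in (50, 52):
    print(order, factorial(order) * coefficient(order))
```

For order 52 the printed value is the exact integer $51!/17!$, which is large but perfectly manageable with Python’s arbitrary-precision arithmetic.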

Sometimes a function is already defined by a power series about x = a, 
and you may need to find certain derivatives of the function at a. This is even 
easier than the above examples, since you don’t have to find the Taylor series 
first. For example, suppose f(x) is defined by 


$$f(x) = \sum_{n=0}^{\infty} \frac{(-1)^{n+1} n^3 (x-6)^{3n}}{n!},$$

which converges for all $x$ (why?!?). Say that you want to evaluate $f^{(300)}(6)$.
Well, the power series is about $x = 6$, so we can use the formula

$$f^{(300)}(6) = 300! \times (\text{coefficient of } (x-6)^{300} \text{ in the above series}).$$

To see what the coefficient is, we should find out which value of $n$ gives the
correct term. Looking at the above series, the general exponent of $(x - 6)$ is
$3n$, so we need the term where $3n = 300$. Thus $n = 100$, and substituting, we
see that the correct term is

$$\frac{(-1)^{100+1}\, 100^3\, (x-6)^{300}}{100!} = \frac{-1000000}{100!}(x-6)^{300}.$$

So the coefficient is $-1000000/100!$. If you want to get really fancy, you can
write 100! as 100 × 99! and cancel out a factor of 100 to see that the coefficient
is $-10000/99!$. Anyway, this shows that

$$f^{(300)}(6) = 300! \times \frac{-10000}{99!} = -\frac{300! \times 10000}{99!}.$$

What if you wanted to find $f^{(301)}(6)$? I leave it to you to show that there is
no term $(x - 6)^{301}$ appearing in the power series, so the answer is 0.


26.4 Using Maclaurin Series to Find Limits 


You can also use some Taylor series to find certain limits. In particular, if 
you have a limit like 




where both the numerator and the denominator are 0 when $x = 0$, then you
could use l’Hôpital’s Rule; however, if you wanted to evaluate

$$\lim_{x \to 0} \frac{e^{-x^2} + x^2\cos(x) - 1}{1 - \cos(2x^3)},$$

you’d have to be stark raving mad to do it that way. The numerator and
denominator are no fun to differentiate once, let alone the six or so times
you’d actually have to do it (as it turns out). So, the correct method is to
replace everything in sight by enough terms of the appropriate Maclaurin
series. What do I mean by “enough terms”? Well, we expect that some
terms might cancel, and we don’t want to be left with 0 in the numerator or
the denominator. Let’s try going up to eighth order first. Let’s write down
Maclaurin series for everything involved. First, since


$$e^x = 1 + x + \frac{x^2}{2} + \frac{x^3}{6} + \frac{x^4}{24} + \cdots,$$

replacing $x$ by $-x^2$, we get

$$e^{-x^2} = 1 - x^2 + \frac{x^4}{2} - \frac{x^6}{6} + \frac{x^8}{24} - \cdots.$$

Now, since

$$\cos(x) = 1 - \frac{x^2}{2} + \frac{x^4}{24} - \frac{x^6}{6!} + \cdots,$$

we can get a series for $x^2\cos(x)$ by multiplying through by $x^2$:

$$x^2\cos(x) = x^2 - \frac{x^4}{2} + \frac{x^6}{24} - \frac{x^8}{6!} + \cdots.$$

If instead we go back to the series for $\cos(x)$ and replace $x$ by $2x^3$, we get

$$\cos(2x^3) = 1 - \frac{(2x^3)^2}{2} + \frac{(2x^3)^4}{24} - \cdots = 1 - 2x^6 + \frac{2}{3}x^{12} - \cdots,$$

where we don’t even need this last term, let alone any higher ones, since we
have decided to go up to order 8. Still, it doesn’t hurt to put it in, so we’ll
leave it. Anyway, if we put all this together, the numerator is

$$\begin{aligned}
e^{-x^2} + x^2\cos(x) - 1 &= \left(1 - x^2 + \frac{x^4}{2} - \frac{x^6}{6} + \frac{x^8}{24} - \cdots\right) + \left(x^2 - \frac{x^4}{2} + \frac{x^6}{24} - \frac{x^8}{720} + \cdots\right) - 1 \\
&= -\frac{1}{8}x^6 + \left(\frac{1}{24} - \frac{1}{720}\right)x^8 + \cdots,
\end{aligned}$$

whereas the denominator becomes

$$1 - \cos(2x^3) = 1 - \left(1 - 2x^6 + \frac{2}{3}x^{12} - \cdots\right) = 2x^6 - \frac{2}{3}x^{12} + \cdots.$$

Now, substituting into the limit, we have

$$\lim_{x\to 0} \frac{e^{-x^2} + x^2\cos(x) - 1}{1 - \cos(2x^3)} = \lim_{x\to 0} \frac{-\frac{1}{8}x^6 + \left(\frac{1}{24} - \frac{1}{720}\right)x^8 + \cdots}{2x^6 - \frac{2}{3}x^{12} + \cdots}.$$

Divide top and bottom by the lowest power, $x^6$, and plug in $x = 0$ to see that
this limit is equal to

$$\frac{-\frac{1}{8}}{2} = -\frac{1}{16}.$$

So, as you can see, the terms involving order higher than 6 didn’t come into it
at all (which is incidentally why I never bothered simplifying the expression
$1/24 - 1/720$). Basically, if everything cancels out, you haven’t used enough
terms, whereas if something is still left, you’ve gone far enough and can proceed.
If you’d only gone up to terms of order 5 (or less), then you would have
gotten 0/0 again, so you wouldn’t have gone far enough.
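Plugging tiny values of $x$ into this quotient on a calculator is unreliable, because both top and bottom are of size $x^6$ and rounding error swamps them. A safer check is to build the series coefficients exactly, as in the Python sketch below. Everything here (the constant `ORDER`, the helper names, the list representation) is an assumption made for illustration.

```python
from fractions import Fraction
from math import factorial

ORDER = 12  # keep powers x^0 .. x^12

def exp_of_minus_x_squared():
    c = [Fraction(0)] * (ORDER + 1)
    for n in range(ORDER // 2 + 1):
        c[2 * n] = Fraction((-1) ** n, factorial(n))      # (-x^2)^n / n!
    return c

def cos_of(scale, power):
    """Series of cos(scale * x**power), truncated at ORDER."""
    c = [Fraction(0)] * (ORDER + 1)
    n = 0
    while 2 * n * power <= ORDER:
        c[2 * n * power] = Fraction((-1) ** n * scale ** (2 * n), factorial(2 * n))
        n += 1
    return c

def lowest_term(c):
    """Return (power, coefficient) of the lowest-degree nonzero term."""
    return next((k, a) for k, a in enumerate(c) if a != 0)

e = exp_of_minus_x_squared()
x2cos = [Fraction(0)] * 2 + cos_of(1, 1)[: ORDER - 1]     # multiply cos(x) by x^2
numerator = [a + b for a, b in zip(e, x2cos)]
numerator[0] -= 1
denominator = [-a for a in cos_of(2, 3)]
denominator[0] += 1

(n_pow, n_coef), (d_pow, d_coef) = lowest_term(numerator), lowest_term(denominator)
print(n_pow, n_coef, d_pow, d_coef, n_coef / d_coef)
```

Both lowest-degree terms turn out to be of degree 6, and the ratio of their coefficients is the limit.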

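Let’s look at one more example: find

$$\lim_{x\to 0}\left(\frac{1}{\sin(x)} - \frac{1}{e^x - 1}\right).$$

This doesn’t look like a fraction, so the first step is to do some algebra. Take
a common denominator, just as we did in the case of l’Hôpital Type B1 limits
in Section 14.1.3 of Chapter 14, to write the limit as

$$\lim_{x\to 0} \frac{e^x - 1 - \sin(x)}{\sin(x)(e^x - 1)}.$$

Now we have

$$\sin(x) = x - \frac{x^3}{6} + \cdots$$

and

$$e^x - 1 = x + \frac{x^2}{2} + \frac{x^3}{6} + \cdots.$$

Putting all this in, the limit becomes

$$\lim_{x\to 0} \frac{\left(x + \frac{x^2}{2} + \frac{x^3}{6} + \cdots\right) - \left(x - \frac{x^3}{6} + \cdots\right)}{\left(x - \frac{x^3}{6} + \cdots\right)\left(x + \frac{x^2}{2} + \frac{x^3}{6} + \cdots\right)} = \lim_{x\to 0} \frac{\frac{x^2}{2} + \frac{x^3}{3} + \cdots}{\left(x - \frac{x^3}{6} + \cdots\right)\left(x + \frac{x^2}{2} + \frac{x^3}{6} + \cdots\right)}.$$

Now, once again, the lowest power dominates as $x \to 0$; to see this, divide top
and bottom by $x^2$. Let’s be sneaky about it, though: on the bottom, we want
to divide both factors by $x$, which is the same as dividing the whole thing by
$x^2$. The limit becomes

$$\lim_{x\to 0} \frac{\frac{1}{2} + \frac{x}{3} + \cdots}{\left(1 - \frac{x^2}{6} + \cdots\right)\left(1 + \frac{x}{2} + \frac{x^2}{6} + \cdots\right)} = \frac{1/2}{(1)(1)} = \frac{1}{2}.$$

Once again, it doesn’t hurt if you write extra terms. I only used up to third
order here, but higher orders would be fine. Actually, the third-order terms
didn’t even come into it at all, and in the denominator we only needed the
first-order terms. Unless you are psychic or have a really good intuition about such
things, it’s pretty hard to guess how many terms you need. So, it’s better to
use more terms rather than fewer; you can always ignore them later, whereas
if you use too few terms, you can’t even solve the problem.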
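This second limit is tame enough to check numerically. The sketch below is only a spot check, not a proof; the function name `g` is arbitrary, and `math.expm1` is used because it computes $e^x - 1$ accurately for small $x$.

```python
import math

def g(x):
    """The expression 1/sin(x) - 1/(e^x - 1) from the example above."""
    return 1 / math.sin(x) - 1 / math.expm1(x)   # expm1(x) computes e^x - 1 accurately

for x in (0.1, 0.01, 0.001):
    print(x, g(x))
```

The printed values creep toward 0.5 as $x$ shrinks, consistent with the answer found from the series.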

Here’s the real reason all the above limits work: if $f$ has a Maclaurin series
with lowest-degree term $a_N x^N$, then

$$f(x) \sim a_N x^N \quad \text{as } x \to 0.$$

We mentioned this fact way back in Section 21.4.5 of Chapter 21; it’s useful
in conjunction with the limit comparison test. In fact, the above relation
is true even if the Maclaurin series for $f$ doesn’t converge for $x \neq 0$. So
there’s no need to work with the complete Maclaurin series: the lowest-order
nonzero Taylor polynomial for $f$ about $x = 0$ is good enough. There’s just
one technical condition, which is that the $(N + 1)$th derivative of $f$ has to be
bounded near 0. Here’s how the whole thing works: by Taylor’s Theorem, we
have

$$f(x) = a_N x^N + R_N(x) = a_N x^N + \frac{f^{(N+1)}(c)}{(N+1)!}\,x^{N+1},$$

where $c$ is between 0 and $x$. Now divide both sides by $a_N x^N$ to get

$$\frac{f(x)}{a_N x^N} = 1 + \frac{f^{(N+1)}(c)}{a_N (N+1)!}\,x.$$

The quantity $f^{(N+1)}(c)/(a_N (N + 1)!)$ on the right-hand side is bounded in
absolute value as $x \to 0$, since the denominator is constant and we’ve assumed
the numerator is bounded. Now you can use the sandwich principle to show
that the last term on the right-hand side of the above equation goes to 0 as
$x \to 0$. That is,

$$\lim_{x\to 0} \frac{f(x)}{a_N x^N} = 1.$$

This is the same as saying that

$$f(x) \sim a_N x^N \quad \text{as } x \to 0,$$

and we have proved our claim. So what? Well, not only do we get a handy
tool to use with the limit comparison test, but we’ve actually proved that all
the above limits work. For example, to really nail the above limit

$$\lim_{x\to 0} \frac{e^x - 1 - \sin(x)}{\sin(x)(e^x - 1)},$$

we should note that $e^x - 1 - \sin(x)$ has a Maclaurin series beginning with
$x^2/2$, so $e^x - 1 - \sin(x) \sim x^2/2$ as $x \to 0$; similarly, $\sin(x) \sim x$ as $x \to 0$, and
$e^x - 1 \sim x$ as $x \to 0$. Since you can multiply and divide these asymptotic
relations (but not add or subtract them!), we can say that

$$\frac{e^x - 1 - \sin(x)}{\sin(x)(e^x - 1)} \sim \frac{x^2/2}{(x)(x)} \quad \text{as } x \to 0.$$

The right-hand side is just 1/2, so we have proved that

$$\lim_{x\to 0} \frac{e^x - 1 - \sin(x)}{\sin(x)(e^x - 1)} = \frac{1}{2}.$$

In reality, the above method (using the full series with the $+ \cdots$ notation) is
generally accepted, even though technically it dances around the true issue.
What’s really going on is shown in the above argument involving the remainder
term $R_N$.



CHAPTER 27 


Parametric Equations and Polar Coordinates 


So far, we’ve sketched the graphs of many equations of the form y = f(x) 
with respect to Cartesian coordinates. Now we’re going to look at things in 
a different way: first, we’ll look at what happens when the coordinates x and
y are not directly related, but are instead related by a common parameter; 
and then we’ll see what happens when we replace the whole darn coordinate 
system with something entirely different. Of course, we have to do some 
calculus too. So here’s the program for this chapter: 

• parametric equations, graphs and finding tangents; 

• converting from polar coordinates to Cartesian coordinates, and vice 
versa; 

• finding tangents to polar curves; and 

• finding areas enclosed by polar curves.

27.1 Parametric Equations

When you write an equation like $y = x^2\sin(x)$, you are expressing $y$ as a
function of $x$. So if you have a particular value of $x$ in mind, then you can
easily find the corresponding value of $y$ by plugging that value of $x$ into the
above equation. On the other hand, consider the relation $x^2 + y^2 = 9$. Now if
you have a particular value of $x$ in mind, you have to work a little harder to
find the corresponding value of $y$. In fact, there may be multiple values of $y$
which correspond to your value of $x$, or there may be none at all. Of course,
you can write $y = \pm\sqrt{9 - x^2}$; this means that there are actually two values of
$y$ corresponding to $x$ if $-3 < x < 3$, but only one value of $y$ if $x = \pm 3$ and no
values of $y$ otherwise.

Now let’s try a different approach: suppose that both $x$ and $y$ are functions
of another variable $t$. For example, we could set

$$x = 3\cos(t) \quad \text{and} \quad y = 3\sin(t).$$

So I’m asking you to think of $x$ as a function of $t$; if you like, you could even
write $x(t) = 3\cos(t)$ to emphasize this. The same goes for $y$. If you pick a
particular value of $t$, then you can get corresponding values for both $x$ and $y$
by plugging your value of $t$ into the above equations. The variable $t$ is called
a parameter, and the above equations are called parametric equations.

What does the graph of the above pair of parametric equations look like?
Let’s try plotting points. Instead of the normal technique of picking some
values of $x$ and finding the corresponding values of $y$, we instead pick values
of $t$ and find the corresponding values of both $x$ and $y$. To plot the points, only
use the values of $x$ and $y$; there is no $t$-axis involved! Anyway, since there are
trig functions around, we should make sure that all of our test values involve
$\pi$. Indeed, suppose we try the following values of $t$:

$$\begin{array}{c|ccccc}
t & 0 & \pi/6 & \pi/4 & \pi/3 & \pi/2 \\ \hline
x & & & & & \\
y & & & & &
\end{array}$$

If we work out the corresponding values of $x$ and $y$ using the above equations
$x = 3\cos(t)$ and $y = 3\sin(t)$, we can fill in the table like this:

$$\begin{array}{c|ccccc}
t & 0 & \pi/6 & \pi/4 & \pi/3 & \pi/2 \\ \hline
x & 3 & 3\sqrt{3}/2 & 3/\sqrt{2} & 3/2 & 0 \\
y & 0 & 3/2 & 3/\sqrt{2} & 3\sqrt{3}/2 & 3
\end{array}$$

So $t = 0$ corresponds to the point $(3, 0)$, and $t = \pi/6$ corresponds to the point
$(3\sqrt{3}/2, 3/2)$, for example. Here’s a graph showing all five points:



[Figure: the five points for $t = 0$, $\pi/6$, $\pi/4$, $\pi/3$ and $\pi/2$, plotted in the first quadrant from $(3, 0)$ up to $(0, 3)$.]


It seems as if we are dealing with a quarter-arc of a circle of radius 3 units
centered at the origin. This should come as no surprise, knowing what we
know about trigonometry! (Of course, for any value of $t$, it is true that
$x^2 + y^2 = (3\cos(t))^2 + (3\sin(t))^2 = 9(\cos^2(t) + \sin^2(t)) = 9$.) Now if you
continue the above table up to $t = \pi$, you describe a semicircle, whereas
if you go all the way to $t = 2\pi$, you get the full circle. What happens if
you keep going? Well, you just start to retrace the circle. The same thing
happens if instead you start at $t = 0$ and make $t$ go more negative, except
that now you move around the circle clockwise instead of counterclockwise.
Notice that if you pick a point $(x, y)$ on the circle, there isn’t just one value
of $t$ which corresponds to that point! There are infinitely many, all separated
by multiples of $2\pi$. For example, if $n$ is any integer, then $t = 2\pi n$ corresponds
to $x = 3$ and $y = 0$, that is, the point $(3, 0)$.

So, the above pair of parametric equations describes the circle $x^2 + y^2 = 9$,
at least if you let $t$ range over a large enough interval, for example $[0, 2\pi)$.
You can say that

$$x = 3\cos(t) \quad \text{and} \quad y = 3\sin(t), \quad \text{where } 0 \le t < 2\pi$$

is a parametrization of $x^2 + y^2 = 9$. Now, I ask you this: is the graph of
$x^2 + y^2 = 9$ the same as the graph of the above parametrization? Yes and
no. Certainly the two graphs look like the same circle, but the parametric
version tells you a little more: it tells you how the circle is drawn. If you start
at $t = 0$ and move continuously up to $t = 2\pi$, then you trace out the circle
by starting at $(3, 0)$, then drawing counterclockwise at a constant speed until
you’re back at the starting point.

The whole thing is sort of like looking at a slime trail left by a snail, compared
with actually watching the snail move and leave the trail. Just looking
at the trail isn’t enough to tell you in which direction the snail moved; it
might even have backtracked! You also can’t tell how fast it was moving at
different times along the trail. (No, “at a snail’s pace” is not a scientific
description of how fast it was moving.) Having a parametrization is like knowing
where the snail is at each time; it allows you to find the extra information of
direction and speed.

So is the above parametrization the only possible one for $x^2 + y^2 = 9$? No
way. There are many other ways to draw the same circle. For example, you
could put $x = 3\cos(2t)$ and $y = 3\sin(2t)$. Now you only need $t$ to range from
0 to $\pi$ to cover the whole circle, and in fact you go around twice as fast as
you did before. Alternatively, you could try $x = 3\sin(t)$ and $y = 3\cos(t)$ for
$0 \le t < 2\pi$. Now you’re back to normal speed, but this time you start at
$(0, 3)$ and go clockwise around the circle instead of counterclockwise. Convince
yourself that these facts are true by plotting a few points.
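If you’d rather let a computer plot the few points, here is a short Python sketch of the three parametrizations just discussed. The function names are invented labels; each function simply returns the pair $(x, y)$ for a given $t$.

```python
import math

def circle_ccw(t):
    return 3 * math.cos(t), 3 * math.sin(t)

def circle_double_speed(t):
    return 3 * math.cos(2 * t), 3 * math.sin(2 * t)

def circle_cw(t):
    return 3 * math.sin(t), 3 * math.cos(t)

for name, curve in [("ccw", circle_ccw), ("double", circle_double_speed), ("cw", circle_cw)]:
    for t in (0, math.pi / 6, math.pi / 4, math.pi / 3, math.pi / 2):
        x, y = curve(t)
        print(f"{name:7s} t={t:.3f}  x={x:6.3f}  y={y:6.3f}  x^2+y^2={x*x + y*y:.3f}")
```

Every row reports $x^2 + y^2 = 9$, so all three trace the same circle; what differs is the starting point, the direction, and how far around you get by $t = \pi/2$.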

How would you find a parametrization for $x^2 + 4y^2 = 9$? Sketching this
curve gives an ellipse through $(\pm 3, 0)$ and $(0, \pm 3/2)$. If you set $Y = 2y$, then
$x^2 + Y^2 = 9$. This is a circle in the new coordinates $(x, Y)$, so we can use our
above parametrization: $x = 3\cos(t)$ and $Y = 3\sin(t)$ for $0 \le t < 2\pi$. Now
we just have to write $y = Y/2$ to get the parametrization

$$x = 3\cos(t) \quad \text{and} \quad y = \frac{3}{2}\sin(t), \quad \text{where } 0 \le t < 2\pi$$

for the ellipse. This is not the only possible parametrization, of course!

How about $x^6 + y^6 = 64$? I leave it to you to sketch this curve and see that
it looks like a bloated circle of “radius” $64^{1/6} = 2$ units. This should inspire
us to adapt the above parametrization of the circle. First, we need to change
the radius to 2 units: indeed, $x = 2\cos(t)$ and $y = 2\sin(t)$ would do the circle
$x^2 + y^2 = 4$, but it fails for the bloated circle, since it’s not true in general
that $\cos^6(t) + \sin^6(t) = 1$. How do we fix this? Well, let’s replace $\cos(t)$ by
some power of itself so that when we take the 6th power, we get $\cos^2(t)$. That
would have to be $\cos^{1/3}(t)$. So if we try $x = 2\cos^{1/3}(t)$ and $y = 2\sin^{1/3}(t)$,
then this should work. Let’s test it:

$$x^6 + y^6 = (2\cos^{1/3}(t))^6 + (2\sin^{1/3}(t))^6 = 64\cos^2(t) + 64\sin^2(t) = 64,$$











: urve, we iex. i range irom u xo 


IS 

l better do some calculus with this parametric 
tangent line to the curve, we’ll need a deriva- 
:e both functions of t, we have to use the chain 

$$\frac{dy}{dt} = \frac{dy}{dx}\,\frac{dx}{dt},$$

and rearrange to get

$$\frac{dy}{dx} = \frac{dy/dt}{dx/dt};$$


and similarly for y, then you can rewrite this 


equation of the tangent line at 
imetric curve defined by 

-1 < ^ < 1 . 

1 

we might as well evaluate the 

1 _ _ 2 _ 

~ VI - 1/4 _ \/3' 


e 

: _7T 

3 tangent line? Well, this line 
e know what the slope is, but 
=1/2 in the original equations
which simplifies slightly to




Now for a trickier example. Suppose we want to find the equation of the
tangent to the curve $x^6 + y^6 = 64$ at the point $(-2^{5/6}, 2^{5/6})$. (You should check
by substituting that this point is actually on the curve.) This can be done by
implicit differentiation, but let’s try using our parametrization $x = 2\cos^{1/3}(t)$
and $y = 2\sin^{1/3}(t)$ from the end of the previous section; here $0 \le t < 2\pi$. We
get

\[ \frac{dx}{dt} = -\frac{2}{3}\cos^{-2/3}(t)\sin(t) \quad\text{and}\quad \frac{dy}{dt} = \frac{2}{3}\sin^{-2/3}(t)\cos(t). \]

So by the chain rule,

\[ \frac{dy}{dx} = \frac{dy/dt}{dx/dt} = \frac{\frac{2}{3}\sin^{-2/3}(t)\cos(t)}{-\frac{2}{3}\cos^{-2/3}(t)\sin(t)} = -\frac{\cos^{5/3}(t)}{\sin^{5/3}(t)}. \]

We want to know what happens at $(-2^{5/6}, 2^{5/6})$. Let’s set $x = -2^{5/6}$; since
$x = 2\cos^{1/3}(t)$, we see that $2\cos^{1/3}(t) = -2^{5/6}$, so $\cos(t) = -1/\sqrt{2}$. If you
play the same game with $y$, you’ll find that $\sin(t) = 1/\sqrt{2}$. You could now find
$t$ if you like; if you think about it, you should be able to see that $t = 3\pi/4$ is
the only solution between 0 and $2\pi$. But in any case, you don’t even have to
find $t$, believe it or not! Knowing just the values of $\sin(t)$ and $\cos(t)$ is enough
to substitute into the above expression for $dy/dx$ to get

\[ \frac{dy}{dx} = -\frac{\cos^{5/3}(t)}{\sin^{5/3}(t)} = -\frac{(-1/\sqrt{2})^{5/3}}{(1/\sqrt{2})^{5/3}} = 1. \]

So we have found that the slope of the tangent line is 1. To find the equation
of the line, we know it passes through $(x, y) = (-2^{5/6}, 2^{5/6})$ and has slope 1,
so its equation is

\[ y - 2^{5/6} = 1\,(x - (-2^{5/6})); \]

make sure you see why this can be simplified to

\[ y = x + 2^{11/6}. \]
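If you have a computer handy, here is one way to sanity-check that slope. This is only a rough numerical sketch, not part of the method above: the helper names (`real_cbrt`, `parametric_slope`) and the step size `h` are my own choices, and the derivatives are approximated by central differences rather than computed exactly.

```python
import math

def real_cbrt(u):
    """Real cube root that also works for negative inputs."""
    return math.copysign(abs(u) ** (1.0 / 3.0), u)

def x_of(t):
    return 2.0 * real_cbrt(math.cos(t))

def y_of(t):
    return 2.0 * real_cbrt(math.sin(t))

def parametric_slope(x, y, t, h=1e-6):
    """Approximate dy/dx = (dy/dt)/(dx/dt) using central differences."""
    dx_dt = (x(t + h) - x(t - h)) / (2.0 * h)
    dy_dt = (y(t + h) - y(t - h)) / (2.0 * h)
    return dy_dt / dx_dt

t0 = 3.0 * math.pi / 4.0            # the parameter value found in the text
x0, y0 = x_of(t0), y_of(t0)         # should be (-2^(5/6), 2^(5/6))
slope = parametric_slope(x_of, y_of, t0)
intercept = y0 - slope * x0         # the tangent line is y = slope*x + intercept

print(slope, intercept, 2.0 ** (11.0 / 6.0))
```

The printed slope should be very close to 1, and the intercept very close to $2^{11/6}$, in agreement with the tangent line found by hand.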


Now for our trickiest example (conceptually speaking, at least). Suppose 
that we are given the following parametric equations: 

\[ x = 4t^2 - 4 \quad\text{and}\quad y = 2t - 2t^3 \quad\text{for all real } t. \]


These equations describe a curve in the $x,y$-plane; let’s find the equation of
any tangent line to this curve at the origin. Notice that I said “any” instead
of “the.” There’s a reason for this! Let’s try to work out which value of $t$
corresponds to the origin. At the origin, both $x$ and $y$ are 0, so we’ll need
$x = 4(t^2 - 1) = 0$ and $y = 2(t - t^3) = 0$. The first of these equations holds
only when $t^2 = 1$, so $t$ must be $\pm 1$. Both of these values satisfy the second










through the origin with slope $-1/2$. Its equation must therefore be $y = -x/2$.
On the other hand, when $t = -1$, we have $dy/dx = 1/2$, so now the tangent
line is $y = x/2$. Let’s see why this is plausible by sketching the curve. To do
this, let’s take some values of $t$ and work out the corresponding values of $x$
and $y$:


  t |   -2   -3/2    -1   -1/2     0    1/2     1    3/2      2
  x |   12      5     0     -3    -4     -3     0      5     12
  y |   12   15/4     0   -3/4     0    3/4     0  -15/4    -12


Plotting these points and making an educated guess, the curve should look
something like this:

itself is the derivative of y with respect to x. Then the problem becomes easy.
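We already saw above that

\[ y' = \frac{dy}{dx} = \frac{1}{4t} - \frac{3t}{4}, \]

so without substituting $t = 1$ yet, we now use the chain rule (and the fact
that $x = 4t^2 - 4$) to write

\[ \frac{d^2y}{dx^2} = \frac{dy'}{dx} = \frac{dy'/dt}{dx/dt} = \frac{\frac{d}{dt}\left(\frac{1}{4t} - \frac{3t}{4}\right)}{\frac{d}{dt}\left(4t^2 - 4\right)} = \frac{-\frac{1}{4t^2} - \frac{3}{4}}{8t} = -\frac{1}{32t^3} - \frac{3}{32t}. \]

Now we can finally substitute $t = 1$ to see that

\[ \frac{d^2y}{dx^2} = -\frac{1}{8}. \]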

As a reality check, look at the above graph. The relevant portion of the curve 
when $t = 1$ is actually the top half of the loop to the left of the y-axis, moving
down through the origin into the fourth quadrant. If you just focus on this 
part of the curve near the origin, you can see that it is indeed concave down, 
so at least we have convinced ourselves that the second derivative should be 
negative, as we found above. 
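Here is another reality check, this time numerical. The sketch below assumes nothing beyond the parametric equations $x = 4t^2 - 4$ and $y = 2t - 2t^3$; the function names and the step size `h` are hypothetical choices of mine, and the derivatives are central-difference approximations, so expect agreement only to several decimal places.

```python
def x_of(t):
    return 4.0 * t**2 - 4.0

def y_of(t):
    return 2.0 * t - 2.0 * t**3

def ddt(f, t, h=1e-4):
    """Central-difference approximation to df/dt."""
    return (f(t + h) - f(t - h)) / (2.0 * h)

def first_derivative(t):
    """dy/dx = (dy/dt) / (dx/dt)."""
    return ddt(y_of, t) / ddt(x_of, t)

def second_derivative(t):
    """d^2y/dx^2 = (d/dt of dy/dx) / (dx/dt)."""
    return ddt(first_derivative, t) / ddt(x_of, t)

print(first_derivative(1.0))    # slope of the tangent at the origin when t = 1
print(first_derivative(-1.0))   # slope of the other tangent, when t = -1
print(second_derivative(1.0))   # should be negative (concave down)
```

The two slopes should come out close to $-1/2$ and $1/2$, and the second derivative at $t = 1$ close to $-1/8$.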


27.2 Polar Coordinates

Suppose your friend is standing in a big flat field at a point that you both 
agree will be the origin. You’d like to tell him or her how to get to another 
spot in the field. If you use Cartesian coordinates, then you might tell your 
friend to go to the point (x, y), where this means that your friend should walk 
x units to the east and then y units north. (You’ll have to agree on what 
units you’re using in advance.) Of course, if x or y is negative, this means
that your friend has to walk backward for the appropriate amount. Also, your 
friend could walk y units north and then x units east — that still gets him or 
her to the same place. 

Instead, you could tell your friend to face due east, then call out an angle 
for him or her to turn in the counterclockwise direction (while staying at the 
origin). If the angle is negative, that means your friend should turn clockwise 
instead. After that, you call out a distance for your friend to march in the 
direction he or she is facing. If the distance is negative, it’s a backward march. 
So instead of coordinates in Cartesian form $(x, y)$, your friend will get $(r, \theta)$;
here $\theta$ is the amount to turn and $r$ units is the distance to march.

If the point you want to describe is actually the origin, then you could
tell your friend $(0, \theta)$ for any angle $\theta$. It doesn’t matter how much he or
she turns; there will be no marching, so your friend just stays at the origin.
Also observe that you could add $2\pi$ onto the angle $\theta$ and it wouldn’t make a
difference. Your friend would simply spin around a full revolution in addition
to $\theta$. The same thing goes for $4\pi$, $6\pi$, or any other integer multiple of $2\pi$,
even negative multiples; it just depends how sadistic you want to be, making
your friend spin around many times without purpose just to make him or her
dizzy! Anyway, now it’s time to look at some formulas.

27.2.1 Converting to and from polar coordinates 

Consider the point $(r, \theta)$ in polar coordinates, which could look something like
this:

Remember, your friend started at the origin facing toward the positive direction
on the $x$-axis, then turned counterclockwise an angle $\theta$, then marched
forward $r$ units to get to the point $P$. What are the Cartesian coordinates
$(x, y)$ of $P$? Well, we know that $\cos(\theta) = x/r$ and $\sin(\theta) = y/r$, so that gives
us

\[ x = r\cos(\theta) \quad\text{and}\quad y = r\sin(\theta). \]

(Compare this with the example $x = 3\cos(t)$, $y = 3\sin(t)$ from Section 27.1
above.) Anyway, these equations show us how to convert from polar to Cartesian
coordinates. For example, what are the Cartesian coordinates of the point
given in polar coordinates by $(2, 11\pi/6)$? First, it’s not a bad idea to draw a
picture:

The picture shows that the reference angle is $2\pi - 11\pi/6$, which equals $\pi/6$.
We are in the fourth quadrant, so cosines are positive and sines are negative;
we therefore have $x = 2\cos(11\pi/6) = 2 \cdot (\sqrt{3}/2) = \sqrt{3}$, and we also have
$y = 2\sin(11\pi/6) = 2 \cdot (-1/2) = -1$. That is, the Cartesian coordinates are
$(\sqrt{3}, -1)$.
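The conversion from polar to Cartesian coordinates is mechanical enough to hand to a computer. The following is a minimal sketch; the function name `polar_to_cartesian` is just a label I made up, and the answers are floating-point approximations rather than exact values like $\sqrt{3}$.

```python
import math

def polar_to_cartesian(r, theta):
    """Return (x, y) for the polar point (r, theta), with theta in radians."""
    return r * math.cos(theta), r * math.sin(theta)

# The example from the text: (2, 11*pi/6) in polar coordinates.
x, y = polar_to_cartesian(2.0, 11.0 * math.pi / 6.0)
print(x, y)

# Adding a full revolution to the angle lands on the same point.
x2, y2 = polar_to_cartesian(2.0, 11.0 * math.pi / 6.0 + 2.0 * math.pi)
print(x2, y2)
```

Both lines of output should be close to $(\sqrt{3}, -1)$, which is roughly $(1.732, -1)$.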

It’s always easier translating from a foreign language into your native
language than the other way around; the same thing happens with polar
coordinates. It’s a little harder getting from Cartesian coordinates to polar
coordinates. The easy part is $r$, since by Pythagoras’ Theorem, $r^2 = x^2 + y^2$.
(You can also see this by squaring both equations in the box above, then
adding them together and using $\cos^2(\theta) + \sin^2(\theta) = 1$.) How about $\theta$? We
know $\tan(\theta) = y/x$ provided that $x \ne 0$, but that doesn’t tell us exactly what
$\theta$ is. You could always add any integer multiple of $\pi$ to $\theta$ without changing
the value of $\tan(\theta)$. So you should draw a picture to see what’s going on.
Here’s a summary of the situation:

\[ r^2 = x^2 + y^2 \quad\text{and}\quad \tan(\theta) = \frac{y}{x} \text{ if } x \ne 0, \text{ but check the quadrant!} \]

Let’s look at an example: suppose we want to write $(-1, -1)$ in polar coordinates.
If you put $x = -1$ and $y = -1$ in the above formulas, you get
$r^2 = (-1)^2 + (-1)^2 = 2$ and $\tan(\theta) = (-1)/(-1) = 1$. So it looks like $r = \sqrt{2}$
and $\theta = \tan^{-1}(1) = \pi/4$. This can’t be right, though! Check out the following
picture:

The point with polar coordinates $(\sqrt{2}, \pi/4)$ is the wrong point, since it’s in
the first quadrant. The correct point is in the third quadrant, and as you can
see from the picture, its polar coordinates should be $(\sqrt{2}, 5\pi/4)$.
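On a computer, the "check the quadrant" step is usually handled by a two-argument arctangent, which looks at the signs of $x$ and $y$ separately. Here is a small sketch using Python’s standard `math.atan2` and `math.hypot`; the wrapper name `cartesian_to_polar` and the decision to report the angle between 0 and $2\pi$ are my own assumptions.

```python
import math

def cartesian_to_polar(x, y):
    """Return (r, theta) with r >= 0 and 0 <= theta < 2*pi."""
    r = math.hypot(x, y)                        # sqrt(x^2 + y^2)
    theta = math.atan2(y, x) % (2.0 * math.pi)  # atan2 picks the right quadrant
    return r, theta

r, theta = cartesian_to_polar(-1.0, -1.0)
naive_theta = math.atan((-1.0) / (-1.0))        # inverse tangent of y/x alone

print(r, theta)        # about 1.414 and 3.927, that is, sqrt(2) and 5*pi/4
print(naive_theta)     # about 0.785, that is, pi/4: the wrong quadrant
```

The one-argument inverse tangent gives $\pi/4$ for the point $(-1, -1)$, whereas the two-argument version gives $5\pi/4$.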

So, where did we go wrong? Well, we said that $\tan(\theta) = 1$, so $\theta = \pi/4$.
We forgot about the solution $\theta = 5\pi/4$. Actually, we also said that $r^2 = 2$,
so $r = \sqrt{2}$, neglecting the solution $r = -\sqrt{2}$. If you look at the above picture
again, you can see that the point $(-1, -1)$ could also be written in polar
coordinates as $(-\sqrt{2}, \pi/4)$. If your friend is standing at the origin, facing
toward the wrong point, but then walks backward for $\sqrt{2}$ units, he or she will
be at the correct point after all.

27.2.3 Finding tangents to polar curves

Luckily, finding tangents to polar curves is just a special case of finding tangents
to curves given by parametric equations. We’ve seen how to do this in
general in Section 27.1.1 above. Let’s see how it works in the case of polar
coordinates.

We have $r = f(\theta)$, and we’d like to find the tangent to the curve at some
point on the curve. Using $x = r\cos(\theta)$ and $y = r\sin(\theta)$, we can write

\[ x = f(\theta)\cos(\theta) \quad\text{and}\quad y = f(\theta)\sin(\theta); \]

this means that $x$ and $y$ are parametrized by $\theta$. By the formula from Section 27.1.1
above, we have

\[ \frac{dy}{dx} = \frac{dy/d\theta}{dx/d\theta}. \]

This gives the slope of the tangent in general. Finally, we just have to plug
in the value of $\theta$ we care about. That’s all there is to it, but let’s see what
happens when we look at some examples.

Consider the curve given in polar coordinates by $r = 1 + 2\cos(\theta)$. We
sketched this in the previous section. Suppose we want the equation of the
tangent through the point with polar coordinates $(2, \pi/3)$. First, let’s do a
reality check: does this point even lie on the curve? Well, when $\theta = \pi/3$,
we have $1 + 2\cos(\theta) = 1 + 2\cos(\pi/3) = 2$, which is the given value of $r$.
So the point does lie on the curve after all. Next, we have to find the slope
of the tangent, $dy/dx$. We have $x = r\cos(\theta) = (1 + 2\cos(\theta))\cos(\theta)$, and
$y = r\sin(\theta) = (1 + 2\cos(\theta))\sin(\theta)$. We need to find $dy/d\theta$ and $dx/d\theta$. Unfortunately,
this involves the product rule, but it’s not too bad. I leave it to you
to check that

\[ \frac{dy}{d\theta} = -2\sin^2(\theta) + (1 + 2\cos(\theta))\cos(\theta) \quad\text{and}\quad \frac{dx}{d\theta} = -\sin(\theta)(1 + 4\cos(\theta)), \]

so we have

\[ \frac{dy}{dx} = \frac{dy/d\theta}{dx/d\theta} = \frac{-2\sin^2(\theta) + (1 + 2\cos(\theta))\cos(\theta)}{-\sin(\theta)(1 + 4\cos(\theta))}. \]

We want to know what happens when $\theta = \pi/3$, so plug that in. You should
get

\[ \frac{dy}{dx} = \frac{-2(3/4) + (1 + 2(1/2))(1/2)}{-(\sqrt{3}/2)(1 + 4(1/2))} = \frac{1}{3\sqrt{3}}. \]

So we know the slope of the line we’re looking for. Now we just need a point the
line goes through. That point is obviously $(2, \pi/3)$ in polar coordinates, but
we need it in Cartesian coordinates. So, just use $x = r\cos(\theta)$ and $y = r\sin(\theta)$
to get $x = 2\cos(\pi/3) = 1$ and $y = 2\sin(\pi/3) = \sqrt{3}$. Great: we need the line
through $(1, \sqrt{3})$ with slope $1/(3\sqrt{3})$. That line is given by

\[ y - \sqrt{3} = \frac{1}{3\sqrt{3}}(x - 1), \]

which simplifies a little to the answer we’re looking for,

\[ 3\sqrt{3}\,y = x + 8. \]

How about the tangent line to the same curve at the origin? Looking at the
graph of $r = 1 + 2\cos(\theta)$ on page 588, you can see that there should in fact be
two tangent lines! We can still find their equations, however. Indeed, we know
that the curve hits the origin when $r = 0$, and we saw in the previous section
that this happens when $\theta = 2\pi/3$ or $\theta = 4\pi/3$. Check that substituting these
values of $\theta$ one at a time into the above equation for $dy/dx$ gives $-\sqrt{3}$ and
$\sqrt{3}$, respectively. Since both tangent lines pass through the origin, they must
have equations $y = -\sqrt{3}\,x$ and $y = \sqrt{3}\,x$. In fact, these lines complete the rays
corresponding to $\theta = 2\pi/3$ and $\theta = 4\pi/3$, shown as dotted lines in the graph
(again, it’s on page 588).
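If you’d like to double-check these three slopes without redoing the product rule, a numerical sketch along the following lines should do. The names `r_of` and `polar_slope` are hypothetical, and the derivatives with respect to $\theta$ are central-difference approximations, so the output is approximate.

```python
import math

def r_of(theta):
    return 1.0 + 2.0 * math.cos(theta)

def polar_slope(r, theta, h=1e-6):
    """Approximate dy/dx = (dy/dtheta)/(dx/dtheta) for the polar curve r(theta)."""
    def x(t):
        return r(t) * math.cos(t)
    def y(t):
        return r(t) * math.sin(t)
    dx = (x(theta + h) - x(theta - h)) / (2.0 * h)
    dy = (y(theta + h) - y(theta - h)) / (2.0 * h)
    return dy / dx

slope_at_point = polar_slope(r_of, math.pi / 3.0)        # expect 1/(3*sqrt(3))
slope_origin_1 = polar_slope(r_of, 2.0 * math.pi / 3.0)  # expect -sqrt(3)
slope_origin_2 = polar_slope(r_of, 4.0 * math.pi / 3.0)  # expect  sqrt(3)

print(slope_at_point, slope_origin_1, slope_origin_2)
```

The three numbers should be close to 0.19245, $-1.73205$ and 1.73205 respectively.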

27.2.4 Finding areas enclosed by polar curves 

If we want to find the area enclosed by the polar curve $r = f(\theta)$, where $f$ is
assumed to be continuous, then we’re going to have to integrate something.
But what? We just have to set up the correct Riemann sum. (See Section 16.2
in Chapter 16 for a review of Riemann sums.) Suppose we take a small chunk
of angle between $\theta$ and $\theta + d\theta$. Then as we move counterclockwise along this
chunk of angle, $r$ meanders from $f(\theta)$ to $f(\theta + d\theta)$. If $d\theta$ is very small, then
$r$ doesn’t have a chance to move far away from $f(\theta)$, so we can approximate
the wedge we’re looking for by a thin slice of pie of radius $r = f(\theta)$ units and
angle $d\theta$, centered at the origin, as shown in the following diagram:

The area of a sector is one half of the radius squared, multiplied by the angle
of the sector (in radians, of course!). So, we can approximate the area of the
wedge (in square units) by $\frac{1}{2}(f(\theta))^2\,d\theta$, which is just $\frac{1}{2}r^2\,d\theta$. The total area,
as $\theta$ varies from $\theta_0$ to $\theta_1$, is found by adding up the areas of all the wedges and
letting $d\theta$ go down to 0, leading* to the following integral:

\[ (\text{area inside } r = f(\theta) \text{ between } \theta = \theta_0 \text{ and } \theta = \theta_1) = \int_{\theta_0}^{\theta_1} \frac{1}{2} r^2\,d\theta. \]

As usual, the area is given in square units.

*To prove the formula, one needs to set up upper and lower sums for the area by
considering the maximum and minimum values of $f(\theta)$, where $\theta$ varies over a subinterval in
a partition of $[\theta_0, \theta_1]$, then show that the upper and lower sums converge to the same value
as the mesh of the partition goes to 0.

Let’s try out this formula on the curve $r = 3\sin(\theta)$, where $0 \le \theta \le \pi$. We
saw in Section 27.2.2 above that this is a circle of radius 3/2 units, so its area
should be $\pi(3/2)^2$, or $9\pi/4$ square units. Let’s verify this. We have

\[ \text{area} = \int_0^{\pi} \frac{1}{2} r^2\,d\theta = \frac{1}{2}\int_0^{\pi} (3\sin(\theta))^2\,d\theta = \frac{9}{2}\int_0^{\pi} \sin^2(\theta)\,d\theta. \]

This integral can be done using the double-angle formulas, as described at
the beginning of Section 19.1 in Chapter 19. Check that you agree that the
answer is $9\pi/4$.
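The wedge picture translates almost word for word into a Riemann sum you can run. The sketch below is only an illustration: `polar_area` is a name I invented, it uses the midpoint of each little chunk of angle, and the number of wedges `n` is an arbitrary choice.

```python
import math

def polar_area(r, theta0, theta1, n=10_000):
    """Add up n thin wedges of area (1/2) * r(theta)^2 * d_theta."""
    d_theta = (theta1 - theta0) / n
    total = 0.0
    for k in range(n):
        theta = theta0 + (k + 0.5) * d_theta   # midpoint of the k-th chunk
        total += 0.5 * r(theta) ** 2 * d_theta
    return total

# The circle r = 3 sin(theta), traced once as theta goes from 0 to pi.
area = polar_area(lambda theta: 3.0 * math.sin(theta), 0.0, math.pi)
print(area, 9.0 * math.pi / 4.0)
```

The two printed numbers should agree to many decimal places, matching the area $9\pi/4$ of a circle of radius 3/2.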

Here’s a harder example. Let’s try to find the area of the croissant-shaped
region enclosed by our curve $r = 1 + 2\cos(\theta)$, as shown in the following
diagram:

It seems as if we should just be able to use the formula to say that the area
we want is given by

\[ \int_0^{2\pi} \frac{1}{2}(1 + 2\cos(\theta))^2\,d\theta. \]

Again, to do this integral we need the double-angle formulas. I leave it to you
to show that

\[ \int \frac{1}{2}(1 + 2\cos(\theta))^2\,d\theta = \frac{3\theta}{2} + 2\sin(\theta) + \frac{1}{2}\sin(2\theta) + C, \]

so the above definite integral can be evaluated by plugging in $\theta = 2\pi$ and
$\theta = 0$ and subtracting, giving $3\pi$. Unfortunately, this isn’t the correct answer.
The problem is that $r$ becomes negative when $\theta$ is between $2\pi/3$ and
$4\pi/3$. Since the formula for the area involves $r^2$, there’s no way to distinguish
between positive and negative area. (This is very different from the situation
in Cartesian coordinates, where area below the $x$-axis is indeed negative.) So
what we have actually found is the area inside the curve $r = |1 + 2\cos(\theta)|$,
which looks like this:

To fix up this crappy situation, we need to find the area inside the little loop
to the left of the vertical axis, and then take it away twice from our original
area. Why twice? Because taking it away once just gives the rest of the
shaded area in the previous picture, but we actually want to cut out a little
loop from the region to get the area we need. So, how do we find the area
inside the little loop? Just repeat the above integral, except from $2\pi/3$ to
$4\pi/3$:

\[ \text{area of little loop} = \frac{1}{2}\int_{2\pi/3}^{4\pi/3} (1 + 2\cos(\theta))^2\,d\theta. \]

Now you should use the above antiderivative to show that this integral works
out to be $(\pi - 3\sqrt{3}/2)$ square units. So, we can finally express the area we
want as $3\pi$ square units minus twice the area of the loop, then work out the
value of the area:

\[ \text{area we want} = 3\pi - 2\left(\pi - \frac{3\sqrt{3}}{2}\right) = (\pi + 3\sqrt{3}) \text{ square units.} \]

As this example shows, you have to be very careful when using the above
formula for area in polar coordinates if r can ever be negative.
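To see the "take it away twice" bookkeeping in action, here is a rough numerical version of the same calculation. It is a sketch only: `half_r_squared_integral` is a made-up helper that applies the midpoint rule to $\frac{1}{2}r^2$, and the three variable names are my own labels for the quantities discussed above.

```python
import math

def r_of(theta):
    return 1.0 + 2.0 * math.cos(theta)

def half_r_squared_integral(a, b, n=20_000):
    """Midpoint-rule approximation to the integral of (1/2) r(theta)^2 from a to b."""
    d_theta = (b - a) / n
    return sum(0.5 * r_of(a + (k + 0.5) * d_theta) ** 2 for k in range(n)) * d_theta

whole = half_r_squared_integral(0.0, 2.0 * math.pi)                     # about 3*pi
loop = half_r_squared_integral(2.0 * math.pi / 3.0, 4.0 * math.pi / 3.0)  # little loop
croissant = whole - 2.0 * loop                                          # area we want

print(whole, loop, croissant)
```

The three values should be close to $3\pi$, $\pi - 3\sqrt{3}/2$ and $\pi + 3\sqrt{3}$, that is, roughly 9.4248, 0.5435 and 8.3377.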









CHAPTER 28 


Complex Numbers 

Why should some quadratics have all the fun? The quadratic $x^2 - 1$ gets the
privilege of having two roots (1 and $-1$), but poor old $x^2 + 1$ doesn’t have any,
since its discriminant is negative. To even things up a little, let’s introduce
the concept of complex numbers. Using complex numbers, any quadratic has
two roots.* (You have to count the double root $a$ of $(x - a)^2$ as two roots.)
Anyway, here’s what we’re going to be doing with complex numbers: 

• basic manipulations (adding, subtracting, multiplying, dividing) and 
solving quadratic equations; 

• the complex plane, and Cartesian and polar forms for complex numbers; 

• taking large powers of complex numbers; 

• solving equations of the form $z^n = w$;

• solving equations of the form $e^z = w$; and

• using some tricks from power series and complex numbers to solve some 
series questions. 

28.1 The Basics

It kind of sucks that you can’t take the square root of $-1$. So, we’ll just do
it anyway. Let’s just create a square root of $-1$ and call it $i$. OK, so then we
must have $i^2 = -1$. Is $i$ the only square root of $-1$? No, $-i$ should also be a
square root, since if there were any justice in the world, then

\[ (-i)^2 = (-1)^2 (i)^2 = 1(-1) = -1. \]

(There is in fact justice in the world: this last series of equations is correct.)
Since $i^2 + 1 = 0$ and $(-i)^2 + 1 = 0$, we now have two roots for the quadratic
$x^2 + 1$ after all, but they are not real: they are imaginary. How about $2i$?
That’s also imaginary. In fact, $(2i)^2 = 2^2 i^2 = 4(-1) = -4$, so $(2i)^2$ is a
negative number. So, when we say that a number is imaginary, we mean that
its square is a negative number. The only imaginary numbers are of the form
$yi$ where $y$ is a real number not equal to 0. You can also write $iy$ instead of
$yi$.

*The surprising thing is that this also works for higher-degree polynomials: every
polynomial of degree $n$ has $n$ complex roots (counting multiplicities). This is due to the so-called
Fundamental Theorem of Algebra, but that’s way beyond the scope of this book. You might
have to look at a book on complex analysis to learn more about this.

Now, you can add or subtract real and imaginary numbers, for example
$2 - 3i$, but you can’t simplify the result. In this way, we get all the complex
numbers, which are all the numbers of the form $x + iy$, where $x$ and $y$ are real.
The set of all complex numbers is normally denoted by the symbol $\mathbb{C}$. Notice
that all imaginary numbers are complex numbers; for example, $2i = 0 + 2i$.
All real numbers are also complex numbers; for example, $-13 = -13 + 0i$.
Every complex number has a real and an imaginary part. If $z = x + iy$, then
the real part is $x$ and the imaginary part is $y$. These are written as $\mathrm{Re}(z)$ and
$\mathrm{Im}(z)$, respectively. For example, $\mathrm{Re}(2 - 3i) = 2$ and $\mathrm{Im}(2 - 3i) = -3$. Note
that $\mathrm{Im}(2 - 3i)$ is not $-3i$, it’s just $-3$. What is $\mathrm{Re}(2i)$? Well, write $2i$ as
$0 + 2i$ to see that the real part is 0. On the other hand, the imaginary part,
$\mathrm{Im}(2i)$, is of course 2.

Adding and subtracting complex numbers is pretty easy. Just add (or
subtract) the real parts, and then do the imaginary parts. For example,

\[ (2 - 3i) + (-6 - 7i) = 2 - 6 - 3i - 7i = -4 - 10i; \]

an example of subtraction is

\[ (2 - 3i) - (-6 - 7i) = 2 + 6 - 3i + 7i = 8 + 4i. \]

Multiplication isn’t much harder: you just expand, but remember to change
$i^2$ into $-1$ whenever you see it. For example,

\[ \begin{aligned} (2 - 3i)(-6 - 7i) &= 2(-6) + 2(-7i) - (3i)(-6) - (3i)(-7i) \\ &= -12 - 14i + 18i + 21i^2 = -12 + 4i - 21 = -33 + 4i. \end{aligned} \]

By the way, what is $i^3$? How about $i^4$? $i^5$? Let’s start off with $i^3$. We
have $i^3 = i^2 \times i = (-1) \times i = -i$. So $i^3$ is just $-i$. On the other hand,
$i^4 = i^3 \times i = (-i) \times i = 1$. That is, $i^4 = 1$. For $i^5$, we play the same game:
$i^5 = i^4 \times i = 1 \times i = i$. In fact, because $i^4 = 1$, we can see that the powers
of $i$ keep on cycling through $1$, $i$, $-1$, $-i$. For example, $i^{101} = i$ since $i^{100} = 1$
(remembering that 100 is divisible by 4).
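Python happens to have complex numbers built in (it writes $i$ as `1j`), so these calculations are easy to confirm. The snippet below is just a check of the arithmetic above; `power_of_i` is a hypothetical helper of mine that uses the four-step cycle instead of multiplying $i$ by itself over and over.

```python
z = 2 - 3j
w = -6 - 7j

total = z + w        # add real parts and imaginary parts separately
difference = z - w
product = z * w      # expand and replace i^2 by -1

def power_of_i(n):
    """Return i**n for an integer n, using the cycle 1, i, -1, -i."""
    return (1, 1j, -1, -1j)[n % 4]

print(total, difference, product)
print(power_of_i(3), power_of_i(4), power_of_i(5), power_of_i(101))
```

The first line of output should show $-4 - 10i$, $8 + 4i$ and $-33 + 4i$ (in Python’s `j` notation), and the second line should show $-i$, $1$, $i$ and $i$.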

How about division? That’s a little trickier, but not much. The technique
is very similar to rationalizing the denominator. It’s inspired by the following
observation: if you have a complex number $x + iy$ and multiply it by the
complex number $x - iy$, you get a real number. When we do the math, we
recognize and apply the formula for the difference of two squares:

\[ (x + iy)(x - iy) = x^2 - (iy)^2 = x^2 - i^2 y^2 = x^2 + y^2. \]

Now $x$ and $y$ are real, so obviously $x^2$ and $y^2$ are as well, and so is their sum.
If $z = x + iy$, the related number $x - iy$ is so important that it has a name:
it is called the complex conjugate of $x + iy$ and denoted $\bar{z}$. For example, if
$z = 2 - 3i$, then $\bar{z} = 2 + 3i$, whereas if $z = 7i$, then $\bar{z} = -7i$. Note that
the complex conjugate of a real number is the same number. This is because
you just flip the sign of the imaginary part to take the complex conjugate,
and real numbers have imaginary part zero. Now as the above formula shows,
a number multiplied by its complex conjugate is real; it is the sum of the
squares of its real and imaginary parts. Inspired by Pythagoras’ Theorem
and the above formula, given a complex number $z = x + iy$, let’s define the
modulus of $z$ to be $\sqrt{x^2 + y^2}$. We write the modulus of $z$ as $|z|$. So

\[ |x + iy| = \sqrt{x^2 + y^2}. \]

Here are some examples: $|2 - 3i| = \sqrt{2^2 + (-3)^2} = \sqrt{4 + 9} = \sqrt{13}$. Similarly,
$|7i| = \sqrt{0^2 + 7^2} = 7$. How about $|-13|$? This equals $\sqrt{(-13)^2 + 0^2} = 13$,
which is exactly the same as the absolute value of $-13$. Our notation for
modulus is completely consistent with the previous notation for absolute value.
In fact, think of the modulus as a beefed-up version of absolute value. Anyway,
the difference of two squares formula above shows that a complex number
multiplied by its complex conjugate is the square of its modulus. That is,

\[ z\bar{z} = |z|^2. \]


After all these preliminaries, we are ready to see how to divide complex
numbers. All you do is multiply top and bottom by the complex conjugate of
the bottom, then expand. The new denominator becomes the square of the
modulus of the old one. For example,

\[ \frac{2 - 3i}{-6 - 7i} = \frac{(2 - 3i)(-6 + 7i)}{(-6 - 7i)(-6 + 7i)}. \]

Now the top needs to be fully multiplied out, but the bottom is just $|-6 - 7i|^2$,
so

\[ \frac{2 - 3i}{-6 - 7i} = \frac{-12 + 18i + 14i - 21i^2}{(-6)^2 + (-7)^2} = \frac{9 + 32i}{85} = \frac{9}{85} + \frac{32}{85}i. \]

We can conclude that

\[ \mathrm{Re}\left(\frac{2 - 3i}{-6 - 7i}\right) = \frac{9}{85} \quad\text{and}\quad \mathrm{Im}\left(\frac{2 - 3i}{-6 - 7i}\right) = \frac{32}{85}. \]

Another example: how would you find

\[ \mathrm{Re}\left(\frac{3 + 4i}{i - 1}\right)? \]

This example contains a slight trick to throw you off guard. The denominator
should really be written as $-1 + i$. Once you do this, you can see that the
complex conjugate of the denominator is $-1 - i$, so

\[ \frac{3 + 4i}{i - 1} = \frac{(3 + 4i)(-1 - i)}{(-1 + i)(-1 - i)} = \frac{-3 - 3i - 4i - 4i^2}{(-1)^2 + (1)^2} = \frac{1 - 7i}{2} = \frac{1}{2} - \frac{7}{2}i. \]

So the real part of $(3 + 4i)/(i - 1)$ is just $\frac{1}{2}$, and as a bonus, its imaginary
part is $-\frac{7}{2}$.
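The conjugate trick can be written as a tiny routine. The following is a sketch under my own conventions: `divide` is a hypothetical function that takes the real and imaginary parts of top and bottom as integers and returns exact fractions (using Python’s standard `fractions.Fraction`), so there is no rounding to worry about.

```python
from fractions import Fraction

def divide(a, b, c, d):
    """Return (a + bi)/(c + di) as exact (real part, imaginary part).

    Multiplying top and bottom by the conjugate c - di gives
    ((ac + bd) + (bc - ad)i) / (c^2 + d^2).
    """
    modulus_squared = c * c + d * d
    real = Fraction(a * c + b * d, modulus_squared)
    imag = Fraction(b * c - a * d, modulus_squared)
    return real, imag

first = divide(2, -3, -6, -7)    # (2 - 3i)/(-6 - 7i)
second = divide(3, 4, -1, 1)     # (3 + 4i)/(-1 + i)

print(first)
print(second)
```

The first result should have real part 9/85 and imaginary part 32/85; the second should have real part 1/2 and imaginary part $-7/2$.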

Now let’s see how to solve quadratic equations. For example, let’s say that
you want to solve $x^2 + 3x + 14 = 0$. Just use the quadratic formula and the
fact that $\sqrt{-1} = \pm i$ to write

\[ x = \frac{-3 \pm \sqrt{3^2 - 4 \times 1 \times 14}}{2} = \frac{-3 \pm \sqrt{-47}}{2} = -\frac{3}{2} \pm \frac{\sqrt{47}}{2}i. \]

Notice that we have simplified $\pm\sqrt{-47}$ as $\pm\sqrt{47}\,i$. Now, how about if you have
a quadratic whose coefficients are complex numbers? The quadratic formula
still works, but you may well have to take the square root of a complex number,
not just a negative number, as in the example we just did. We’ll look at an
example of this in Section 28.4.1 below.
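For what it’s worth, the quadratic formula can be run directly once square roots of negative numbers are allowed. The sketch below uses `cmath.sqrt` from Python’s standard library, which returns a complex square root; the function name `quadratic_roots` is my own, and the results are floating-point approximations.

```python
import cmath

def quadratic_roots(a, b, c):
    """Both roots of a*x^2 + b*x + c = 0 (with a != 0), possibly complex."""
    root_of_discriminant = cmath.sqrt(b * b - 4 * a * c)
    return ((-b + root_of_discriminant) / (2 * a),
            (-b - root_of_discriminant) / (2 * a))

r1, r2 = quadratic_roots(1, 3, 14)
print(r1, r2)

# Substituting a root back in should give (nearly) zero.
print(abs(r1 ** 2 + 3 * r1 + 14))
```

The two roots should be close to $-\frac{3}{2} \pm \frac{\sqrt{47}}{2}i$, that is, about $-1.5 \pm 3.428i$.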


28.1.1 Complex exponentials

We’ve discussed how to add and multiply complex numbers. How about
exponentiating them? Let’s see how we can make sense of something like $e^z$
when $z$ is complex. From Section 24.2.3 in Chapter 24, we know that

\[ e^x = \sum_{n=0}^{\infty} \frac{x^n}{n!} \]

for all real $x$. What happens if we replace $x$ by $z$ on the right-hand side,
where $z$ is some complex number? We’ll get a series whose terms are complex
numbers. Believe it or not, you can still use the ratio test to show that the
series converges, no matter what complex number $z$ happens to be. (We only
proved the ratio test for real series, but it turns out that once you define what
convergence means for complex sequences, the same proof works.) Inspired by
all this, we’ll define $e^z$, for any complex number $z$, by the following equation:

\[ e^z = \sum_{n=0}^{\infty} \frac{z^n}{n!}. \]

Certainly this works nicely enough when $z$ is real, since the definition agrees
with the above equation for $e^x$. On the other hand, it would be good to know
that our new toy, $e^z$, does all the nice things that we expect of exponentials.
Actually, the critical thing it needs to do is to satisfy the exponential rule
$e^z e^w = e^{z+w}$. Once we know that, all the other exponential rules follow more
or less immediately.

So, how do we show that $e^z e^w = e^{z+w}$? Here’s a sneaky way. We know
that $e^x e^y = e^{x+y}$ for any real $x$ and $y$, so this means that

\[ \left(\sum_{n=0}^{\infty} \frac{x^n}{n!}\right)\left(\sum_{m=0}^{\infty} \frac{y^m}{m!}\right) = \sum_{k=0}^{\infty} \frac{(x + y)^k}{k!}. \]

We have just replaced each exponential by its Maclaurin series, using a different
dummy variable in each sum. If you multiply out the two series on the
left-hand side, you get some double power series in powers of $x$ and $y$, and the
same goes for the right-hand side. The coefficients of $x^n y^m$ on the left- and
right-hand sides of the equations must therefore be the same. This will also be
true if $x$ and $y$ are replaced by complex numbers like $z$ and $w$ (respectively),
so we have proved that $e^z e^w = e^{z+w}$ for any two complex numbers $z$ and $w$
after all!
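None of this is a substitute for the argument above, but you can watch the series definition and the exponential rule at work numerically. In this sketch, `exp_series` is a hypothetical helper that adds up the first 60 terms of the series (an arbitrary cutoff that is plenty for small $z$), and the particular values of $z$ and $w$ are just ones I picked.

```python
import cmath

def exp_series(z, terms=60):
    """Partial sum of the series z**n / n! for n = 0, 1, ..., terms - 1."""
    total = 0j
    term = 1 + 0j                 # the n = 0 term, z**0 / 0!
    for n in range(terms):
        total += term
        term *= z / (n + 1)       # turn z**n/n! into z**(n+1)/(n+1)!
    return total

z = 1 + 2j
w = -0.5 + 1j

lhs = exp_series(z) * exp_series(w)
rhs = exp_series(z + w)
print(lhs, rhs)                          # should agree closely
print(exp_series(z), cmath.exp(z))       # compare with the library exponential
```

The product of the two partial sums should match the partial sum for $z + w$ to within rounding error, and `exp_series(z)` should match the library function `cmath.exp(z)`.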


28.2 The Complex Plane 

Real numbers are usually represented as points on a number line, which is
one-dimensional. Complex numbers literally have an extra dimension. Indeed,
if $z = x + iy$, we can’t squish all the information into just one real
number. Instead of a real number line, we’ll use a complex number plane.
The complex number $z = x + iy$ will be represented as the point $(x, y)$ in
Cartesian coordinates. It’s pretty easy to plot complex numbers like $2 - 3i$,
$2i$, and $-1$:

You should think of each point as representing one complex number, rather
than as a pair of real numbers.

In the previous chapter, we saw that you can also express every point in
the plane in polar coordinates instead. (You should review Section 27.2.1 now
if you haven’t looked at it for a while.) So suppose you have a point in the
complex plane which has polar coordinates $(r, \theta)$. What is the complex number
represented by that point? Well, we can convert to Cartesian coordinates
using $x = r\cos(\theta)$ and $y = r\sin(\theta)$. So the point $(r, \theta)$ in polar coordinates
represents the complex number $z = x + iy = r\cos(\theta) + ir\sin(\theta)$. In particular,
if $r = 1$, then $z$ is just $\cos(\theta) + i\sin(\theta)$.

Now, there’s a pretty bizarre and funky identity, due to Euler, which is
really important:

\[ e^{i\theta} = \cos(\theta) + i\sin(\theta). \]

have found the polar coordinates $(r, \theta)$ for your point, then you could add any
integer multiple of $2\pi$ to $\theta$ and it wouldn’t make a difference. For example,
the point $(0, -1)$ has polar coordinates $(1, 3\pi/2)$, or you can subtract $2\pi$ to
see that it also has polar coordinates $(1, -\pi/2)$. In terms of complex numbers,
this means that $e^{i(3\pi/2)} = e^{-i\pi/2}$. So $e^{i\theta}$ is periodic in $\theta$ with period $2\pi$.
This is an important fact which will come in handy a little later.

We’ve just seen above that $e^{i\pi} = -1$. Let’s just reflect on this for a moment.
It’s really quite awesome, when you think about it. What have been
the fundamental new numbers in your math education so far? Introducing
the number $-1$ opens the door to negative numbers. The number $\pi$ arises
from the geometry of circles. The number $e$ is the natural base for logarithms
and is fundamental in the study of calculus. And the number $i$ leads the way
to complex numbers and being able to solve quadratic (and higher-degree
polynomial) equations. The fact that they are combined into such a simple
formula is pretty remarkable, if you ask me. Anyway, enough of this philosophical
rambling: let’s look at some examples of how to convert complex
numbers from polar to Cartesian form and vice versa.

28.2.1 Converting to and from polar form

To convert a complex number from polar to Cartesian form, just use Euler’s
identity directly (that’s $e^{i\theta} = \cos(\theta) + i\sin(\theta)$ in case you have already forgotten!).
For example, what is $2e^{i(5\pi/6)}$ in Cartesian form? Well, Euler’s
identity says that it is $2(\cos(5\pi/6) + i\sin(5\pi/6))$. See why you need to know
your trig? Hopefully you can work out that $\cos(5\pi/6) = -\sqrt{3}/2$ and that
$\sin(5\pi/6) = 1/2$, so we have

\[ 2e^{i(5\pi/6)} = 2\left(\cos\left(\frac{5\pi}{6}\right) + i\sin\left(\frac{5\pi}{6}\right)\right) = 2\left(-\frac{\sqrt{3}}{2} + \frac{1}{2}i\right) = -\sqrt{3} + i. \]

On the other hand, converting from Cartesian to polar form is more difficult,
as we observed in Section 27.2.1. There we saw that

\[ r = \sqrt{x^2 + y^2} \quad\text{and}\quad \tan(\theta) = \frac{y}{x}, \]

where we have now dropped the possible solution $r = -\sqrt{x^2 + y^2}$ since we
want $r \ge 0$ for complex numbers. By the way, we defined the modulus of $z$
to be $|z| = \sqrt{x^2 + y^2}$. So $r$ is the same as $|z|$. The modulus $|z|$ is therefore
the distance from the point $z$ to the origin (in the complex number plane).
The angle $\theta$ is called the argument of $z$ and is written $\arg(z)$. (Normally one
requires that $0 \le \arg(z) < 2\pi$ so that there’s no ambiguity.*)

So, to convert $z$ from Cartesian to polar coordinates, we just have to find
the modulus and argument of $z$, using the above formulas. (In fact, sometimes
the polar form of $z$ is referred to as mod-arg form.) For example, how would
you convert $z = 1 - i$ into polar form? Well, think of $z$ as being written as

*Often this condition is replaced by $-\pi < \arg(z) \le \pi$ instead.










Now we see that $\theta = \pi$ (or if you prefer, $-\pi$, or even $3\pi$, or any odd multiple
of $\pi$). So, we have $-6 = 6e^{i\pi}$. Incidentally, if we divide by 6, we get the
amazing formula $e^{i\pi} = -1$, which we discussed in the previous section.
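Python’s standard `cmath` module can do both conversions, which makes for a handy check. One caveat, and it is a real one: `cmath.polar` reports the argument between $-\pi$ and $\pi$, so the sketch below shifts it into the range from 0 to $2\pi$. The wrapper names `to_polar` and `from_polar` are my own.

```python
import cmath
import math

def to_polar(z):
    """Return (r, theta) with r >= 0 and 0 <= theta < 2*pi."""
    r, theta = cmath.polar(z)              # cmath gives -pi < theta <= pi
    return r, theta % (2.0 * math.pi)

def from_polar(r, theta):
    """Return the complex number r * e^(i*theta) in Cartesian form."""
    return cmath.rect(r, theta)

z = from_polar(2.0, 5.0 * math.pi / 6.0)   # expect about -sqrt(3) + i
r1, theta1 = to_polar(1 - 1j)              # expect sqrt(2) and 7*pi/4
r2, theta2 = to_polar(-6)                  # expect 6 and pi

print(z)
print(r1, theta1)
print(r2, theta2)
```

Up to rounding, the output should show $-\sqrt{3} + i$, then modulus $\sqrt{2}$ with argument $7\pi/4$, then modulus 6 with argument $\pi$.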


28.3 Taking Large Powers of Complex Numbers



Why on earth would you want to use the polar form? One reason is that it’s
really easy to multiply and take powers in polar form. Imagine you wanted to
multiply $3e^{i\pi/4}$ by $2e^{-i(3\pi/8)}$. This is pretty simple: you just use the normal
exponential rules (see Section 9.1.1 in Chapter 9) to write

\[ (3e^{i\pi/4})(2e^{-i(3\pi/8)}) = 6e^{i(\pi/4 - 3\pi/8)} = 6e^{-i\pi/8}. \]

Even better, imagine you want to raise $3e^{i\pi/4}$ to the 200th power. This is just

\[ (3e^{i\pi/4})^{200} = 3^{200} e^{(i\pi/4) \times 200} = 3^{200} e^{i(50\pi)}. \]

In fact, by Euler’s identity, $e^{i(50\pi)} = \cos(50\pi) + i\sin(50\pi)$. Since $50\pi$ is an
integer multiple of $2\pi$, we have $\cos(50\pi) = 1$ and $\sin(50\pi) = 0$, so we have
proved that $(3e^{i\pi/4})^{200} = 3^{200}$.

A lot of the time, you might want the final answer in Cartesian form.
For example, suppose we’d like to compute $(1 - i)^{99}$ and give the answer in
Cartesian form. Expanding the expression by multiplying out would be crazy,
so we won’t go there. The correct way to proceed is to translate $1 - i$ into
polar form, take the 99th power, then translate back into Cartesian form.
OK, we saw in the previous section that $1 - i = \sqrt{2}\,e^{i(7\pi/4)}$ in polar form, so
we have

\[ (1 - i)^{99} = \left(\sqrt{2}\,e^{i(7\pi/4)}\right)^{99} = 2^{99/2} e^{i(693\pi/4)}. \]

Now, we have to go back to Cartesian form. Before we do this, let’s look
at $e^{i(693\pi/4)}$. This fraction $693\pi/4$ is a bit of a pest. Remember that $e^{i\theta}$ is
$2\pi$-periodic in $\theta$, however, so we can knock off any multiple of $2\pi$ from the
fraction $693\pi/4$ and not affect the answer. So, write $693/4 = 173\frac{1}{4}$. The
biggest even number less than this is 172, and the difference between these
two numbers is $173\frac{1}{4} - 172 = 5/4$. So, we can think of $693\pi/4$ as $172\pi + 5\pi/4$.
Since $172\pi$ is a multiple of $2\pi$ (this is why we wanted an even number, 172
in this case), we know that $e^{i(693\pi/4)} = e^{i(5\pi/4)}$. That’s much nicer. Now we
can convert the whole thing to Cartesian form:

\[ (1 - i)^{99} = 2^{99/2} e^{i(693\pi/4)} = 2^{99/2} e^{i(5\pi/4)} = 2^{99/2}\left(\cos\left(\frac{5\pi}{4}\right) + i\sin\left(\frac{5\pi}{4}\right)\right) = 2^{99/2}\left(-\frac{1}{\sqrt{2}} - \frac{1}{\sqrt{2}}i\right). \]

In fact, this can be further simplified by writing $1/\sqrt{2}$ as $2^{-1/2}$; the final
answer should be $-2^{49}(1 + i)$. Now, as an exercise, you should check that you
can arrive at the same answer by starting off using an alternate polar form,
$1 - i = \sqrt{2}\,e^{-i\pi/4}$.

In summary, to take a large power of a complex number, first convert it to
polar form, then take the power. Find the largest even multiple of $\pi$ less than
the angle $\theta$, and take that away from $\theta$ and replace $\theta$ by that new number.
Finally, convert back to Cartesian form.
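Expanding $(1 - i)^{99}$ by hand would be crazy, but a computer doesn’t mind, so it makes a good independent check on the polar-form answer. In the sketch below, `gaussian_power` is a made-up helper that multiplies out $(a + bi)^n$ exactly using whole numbers, and `reduce_angle` (also my own name) knocks multiples of $2\pi$ off an angle using floating-point arithmetic.

```python
import math

def gaussian_power(a, b, n):
    """Exact (a + bi)**n for integers a, b and n >= 0, as a pair (real, imag)."""
    re, im = 1, 0
    for _ in range(n):
        # (re + im*i)(a + b*i) = (re*a - im*b) + (re*b + im*a)i
        re, im = re * a - im * b, re * b + im * a
    return re, im

def reduce_angle(theta):
    """Remove whole multiples of 2*pi, leaving an angle in [0, 2*pi)."""
    return theta % (2.0 * math.pi)

exact = gaussian_power(1, -1, 99)            # brute-force (1 - i)^99
reduced = reduce_angle(693.0 * math.pi / 4.0)

print(exact, -2 ** 49)
print(reduced, 5.0 * math.pi / 4.0)
```

The exact power should come out as $(-2^{49}, -2^{49})$, which is $-2^{49}(1 + i)$, and the reduced angle should be close to $5\pi/4$.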

Solving $z^n = w$

Let’s move onto a trickier subject: how to solve equations of the form $z^n = w$,
where $n$ is a given integer and $w$ is a given complex number. This amounts
to taking $n$th roots of $w$, but we don’t just want to say $z = \sqrt[n]{w}$ since that
doesn’t tell us very much. Instead, we’ll try to find a solution directly. Since
powers work so well in polar form, that’s what we’ll use.

For example, to solve $z^5 = -\sqrt{3} + i$, we should use polar coordinates for
both $z$ and $w = -\sqrt{3} + i$. Since we don’t know what $z$ is, let’s put $z = re^{i\theta}$.
Now to find $z$, we just have to find what $r$ and $\theta$ are. As for $w$, let’s write
$-\sqrt{3} + i = Re^{i\varphi}$ and then find $R$ and $\varphi$. (We have to use $R$ and $\varphi$ instead of
$r$ and $\theta$ since the last two variables are already taken for $z$.) Now, let’s draw
a picture of the situation:



(¥)—(?)) 
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So we have R = (—v^) 2 + ⑴ 2 = 2, and tan(p) = —1/\/3. Since the 

point is in the second quadrant, (p must be 57 r/ 6 . Great, so we know that 
-y/S + i = 2 e*(W 6 ) in po lar form. _ 

Now let’s turn our attention to the equation $z^5 = -\sqrt{3} + i$ and convert
the whole thing to polar form. On the left, we replace $z$ by $re^{i\theta}$ to get
$z^5 = (re^{i\theta})^5 = r^5e^{i(5\theta)}$, whereas we’ve just seen that the right-hand side is
$2e^{i(5\pi/6)}$. So our equation becomes

$$r^5e^{i(5\theta)} = 2e^{i(5\pi/6)}.$$

If you take the modulus of both sides, you get $r^5 = 2$ (because the modulus
of $e^{iA}$ is always 1 if $A$ is real). Then we can cancel out $r^5$ and 2, since they
are equal, to get $e^{i(5\theta)} = e^{i(5\pi/6)}$. We have dissected the above equation into
two separate equations:

$$r^5 = 2 \quad\text{and}\quad e^{i(5\theta)} = e^{i(5\pi/6)}.$$

The first is easy to solve: just take the 5th root to get $r = 2^{1/5}$, which is
legit since $r$ is a nonnegative real number. As for the second equation, you
may be tempted to say $5\theta = 5\pi/6$, but it’s not that simple. Remember, $e^{i\theta}$
is $2\pi$-periodic in the variable $\theta$! You can express this fact via the following
important principle, which I want you to remember better than you’ve ever
remembered anything before:


If $e^{iA} = e^{iB}$ for real numbers $A$ and $B$, then
$A = B + 2\pi k$, where $k$ is an integer.


This principle saves the day. Since $e^{i(5\theta)} = e^{i(5\pi/6)}$, we use the principle to
see that

$$5\theta = \frac{5\pi}{6} + 2\pi k,$$

where $k$ is an integer. Dividing by 5, we have

$$\theta = \frac{\pi}{6} + \frac{2\pi k}{5}.$$

So it looks as if there are infinitely many values of $\theta$, and therefore infinitely
many values of $z$ that solve our equation. Appearances can be deceptive,
however! You see, since $n = 5$, you only need to use the first five values for
$k$, namely, $k = 0, 1, 2, 3, 4$. We’ll see why in just a moment; for now, we can
calculate that as $k$ goes from 0 through 4 inclusive, the values of $\theta$ are

$$\frac{\pi}{6}, \quad \frac{17\pi}{30}, \quad \frac{29\pi}{30}, \quad \frac{41\pi}{30}, \quad\text{and}\quad \frac{53\pi}{30},$$

respectively. Putting these values of $\theta$, along with $r = 2^{1/5}$, into the equation
$z = re^{i\theta}$, we get

$$z = 2^{1/5}e^{i\pi/6}, \quad 2^{1/5}e^{i(17\pi/30)}, \quad 2^{1/5}e^{i(29\pi/30)}, \quad 2^{1/5}e^{i(41\pi/30)}, \quad\text{or}\quad 2^{1/5}e^{i(53\pi/30)}.$$


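Of course, it would be nice to change these to Cartesian form. The first
solution is pretty easy:

$$2^{1/5}e^{i\pi/6} = 2^{1/5}\left(\cos\left(\frac{\pi}{6}\right) + i\sin\left(\frac{\pi}{6}\right)\right) = 2^{1/5}\left(\frac{\sqrt{3}}{2} + \frac{1}{2}\,i\right) = 2^{-4/5}(\sqrt{3} + i).$$

As for the others, they don’t look too nice. For example, the second solution
from the above list works out to be

$$2^{1/5}e^{i(17\pi/30)} = 2^{1/5}\cos\left(\frac{17\pi}{30}\right) + i\,2^{1/5}\sin\left(\frac{17\pi}{30}\right),$$

which can’t easily be simplified. (Do you know what $\cos(17\pi/30)$ is? I don’t
either, and it’s not worth working out.) I leave it to you to write out the other
three solutions in (unsimplified) Cartesian form.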

Now, let’s see why you only need to let $k$ go from 0 through 4, discarding
all the other possible values of $k$. Let’s see what happens when $k = 5$. Using
the equation

$$\theta = \frac{\pi}{6} + \frac{2\pi k}{5}$$

from above, we see that when $k = 5$,

$$\theta = \frac{\pi}{6} + 2\pi.$$

This is certainly a different value of $\theta$ from any of the ones we already listed
above, but it doesn’t lead to a different value of $z$. Why? Because

$$2^{1/5}e^{i(\pi/6 + 2\pi)} = 2^{1/5}e^{i\pi/6}e^{i(2\pi)} = 2^{1/5}e^{i\pi/6}.$$

That is, we get the same solution as the case $k = 0$. Similarly, if you try to
put $k = 6$, you should get the same value of $z$ as when $k = 1$. In general,
any time you increase $k$ by 5, you will simply get the same value of $z$ again.
So, the values $k = 0, 5, 10, \ldots$, as well as $k = -5, -10, -15, \ldots$, all lead to
the same solution, $z = 2^{1/5}e^{i\pi/6}$. Similarly, the values $k = 1, 6, 11, \ldots$ and
$k = -4, -9, -14, \ldots$ give the same solution. The same goes for the other
three solutions. While you need to appreciate this fact, in practice it is simple
to apply: unless $w = 0$, the equation $z^n = w$ has $n$ different solutions, which
occur when $k = 0, 1, \ldots, n - 1$. Those are the only values of $k$ you need to
use. In our case $n = 5$, so we only needed $k = 0, 1, 2, 3, 4$.
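Here is a small numerical sketch of that example, using Python’s standard `cmath` module. The function name `root` is an assumption of mine, not anything from the text. It builds the solution belonging to a given $k$, so you can confirm that each one really satisfies $z^5 = -\sqrt{3} + i$ and that $k = 5$ just repeats $k = 0$.

```python
import cmath
import math

w = complex(-math.sqrt(3), 1)     # right-hand side of z^5 = w
R, phi = cmath.polar(w)           # modulus R and argument phi of w
n = 5

def root(k):
    """Solution of z^n = w belonging to the integer k."""
    r = R ** (1 / n)                        # r = R^(1/n)
    theta = (phi + 2 * math.pi * k) / n     # theta = phi/n + 2*pi*k/n
    return cmath.rect(r, theta)             # back to Cartesian form

roots = [root(k) for k in range(n)]         # k = 0, 1, 2, 3, 4
```

Raising each entry of `roots` to the fifth power should reproduce `w` up to rounding error, and `root(5)` should agree with `root(0)` to within the same tolerance.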

It’s interesting to plot the solutions in the complex plane. They all have
modulus $2^{1/5}$, which means that they lie on the circle centered at the origin
of radius $2^{1/5}$ units. Also, the difference between the arguments (that is,
values of $\theta$) of consecutive solutions is $2\pi/5$, which is one-fifth of a complete
revolution. This means that the solutions are evenly spaced around the circle;
that is, they form a regular pentagon (the solutions are labeled $z_0$ through
$z_4$):









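[Figure: the five solutions $z_0$ through $z_4$ on the circle of radius $2^{1/5}$, forming a regular pentagon]

If $w = 0$, the only solution of $z^n = w$ is $z = 0$. Otherwise, here are the
main steps in solving $z^n = w$:

1. Write $z = re^{i\theta}$, so that $z^n = r^ne^{in\theta}$.

2. Convert $w$ into polar form $Re^{i\varphi}$.

3. The equation becomes $r^ne^{in\theta} = Re^{i\varphi}$.

4. Decompose into two equations: $r^n = R$ and $e^{in\theta} = e^{i\varphi}$.

5. The first is simple to solve: take $n$th roots to get $r = R^{1/n}$.

6. For the second, use the above triple-boxed principle to get $n\theta = \varphi + 2\pi k$,
where $k$ is an integer.

7. Divide this by $n$, then write out all the different values for $\theta$ when
$k = 0, 1, 2, \ldots, n - 1$.

8. Substitute the value of $r$ and the different values of $\theta$ into $z = re^{i\theta}$ to
get $n$ different values for $z$, which are the solutions.

9. If necessary, change each and every one of those solutions into Cartesian
form.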
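The steps above translate almost line by line into code. The following is only a sketch: the function name `solve_zn` and the test equation $z^4 = -16$ are my own choices for illustration, and the answers are floating-point approximations rather than exact values.

```python
import cmath
import math

def solve_zn(w, n):
    """Return the solutions of z**n == w, following the steps listed above."""
    if w == 0:
        return [0j]                          # the only solution is z = 0
    R, phi = cmath.polar(w)                  # convert w to polar form R e^{i phi}
    r = R ** (1 / n)                         # solve r^n = R
    thetas = [(phi + 2 * math.pi * k) / n    # n*theta = phi + 2*pi*k, divided by n,
              for k in range(n)]             # for k = 0, 1, ..., n - 1
    return [cmath.rect(r, theta) for theta in thetas]   # Cartesian form

# Hypothetical example: the four solutions of z^4 = -16.
fourth_roots = solve_zn(complex(-16, 0), 4)
```

Each of the four values should have modulus 2, and raising any of them to the fourth power should give $-16$ up to rounding error.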



Let’s look at one more example: what are the cube roots of $i$? This
question is asking us to solve the equation $z^3 = i$. We start off by writing
$z = re^{i\theta}$, so $z^3 = r^3e^{i(3\theta)}$ (step 1). Now, we have to convert $i$ into polar
coordinates (step 2), but we have already seen above that $i = e^{i\pi/2}$. So,
since $z^3 = i$, we have $r^3e^{i(3\theta)} = 1e^{i\pi/2}$ (step 3). This leads to the equations
$r^3 = 1$ and $e^{i(3\theta)} = e^{i\pi/2}$ (step 4). Taking cube roots in the first equation
gives $r = 1$ (step 5), and our important principle and the second equation
show that $3\theta = \pi/2 + 2\pi k$, where $k$ is an integer (step 6). This is the same
as $\theta = \pi/6 + 2\pi k/3$; since $n = 3$ in this question, we only need $k = 0, 1, 2$.
Writing these out, we have

$$\theta = \frac{\pi}{6}, \quad \left(\frac{\pi}{6} + \frac{2\pi}{3}\right) = \frac{5\pi}{6}, \quad\text{or}\quad \left(\frac{\pi}{6} + \frac{4\pi}{3}\right) = \frac{3\pi}{2}$$

(step 7). This leads to three possibilities for $z$, which are

$$z = e^{i\pi/6}, \quad e^{i(5\pi/6)}, \quad\text{or}\quad e^{i(3\pi/2)}$$

(step 8). Finally, we should convert these into Cartesian form (step 9). The
first solution is

$$e^{i\pi/6} = \cos\left(\frac{\pi}{6}\right) + i\sin\left(\frac{\pi}{6}\right) = \frac{\sqrt{3}}{2} + \frac{1}{2}\,i.$$

The second solution is

$$e^{i(5\pi/6)} = \cos\left(\frac{5\pi}{6}\right) + i\sin\left(\frac{5\pi}{6}\right) = -\frac{\sqrt{3}}{2} + \frac{1}{2}\,i.$$

Finally, the third solution is

$$e^{i(3\pi/2)} = \cos\left(\frac{3\pi}{2}\right) + i\sin\left(\frac{3\pi}{2}\right) = 0 - i(1) = -i.$$

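Let’s plot these three solutions and check that they do indeed form an equilateral triangle:

[Figure: the three solutions $\frac{\sqrt{3}}{2} + \frac{1}{2}i$, $-\frac{\sqrt{3}}{2} + \frac{1}{2}i$, and $-i$ on the unit circle, forming an equilateral triangle]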

28.4.1 Some variations 



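Suppose you want to solve the equation $(z - 2)^3 = i$. No problem: just let
$Z = z - 2$, so that the equation is $Z^3 = i$. Solve this exactly as we just did at
the end of the previous section to find that

$$Z = z - 2 = \frac{\sqrt{3}}{2} + \frac{1}{2}\,i, \quad -\frac{\sqrt{3}}{2} + \frac{1}{2}\,i, \quad\text{or}\quad -i.$$

Adding 2 to each of these values gives the three solutions for $z$.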
These are the four solutions to $z^4 - z^2 + 1 = 0$. It follows that we can factor
$z^4 - z^2 + 1$ as follows:

$$z^4 - z^2 + 1 = \left(z - \frac{\sqrt{3} + i}{2}\right)\left(z - \frac{\sqrt{3} - i}{2}\right)\left(z - \frac{-\sqrt{3} + i}{2}\right)\left(z - \frac{-\sqrt{3} - i}{2}\right).$$

This is the complex factorization. To get the real factorization, we need to
use a nice fact: if $w$ is any complex number, then $(z - w)(z - \bar{w})$ has real
coefficients when you multiply it out. Indeed, you get $z^2 - (w + \bar{w})z + w\bar{w}$,
but it’s easy enough to see that $w + \bar{w} = 2\operatorname{Re}(w)$ (which is real), and we’ve
already seen that $w\bar{w} = |w|^2$, which is also real. Anyway, notice that I have
cunningly grouped the above four factors so that if we multiply out the first
two, we get

$$\left(z - \frac{\sqrt{3} + i}{2}\right)\left(z - \frac{\sqrt{3} - i}{2}\right) = z^2 - \left(\frac{\sqrt{3} + i}{2} + \frac{\sqrt{3} - i}{2}\right)z + \left(\frac{\sqrt{3} + i}{2}\right)\left(\frac{\sqrt{3} - i}{2}\right) = z^2 - \sqrt{3}\,z + 1.$$

Similarly, you should check that multiplying out the last two factors gives
$z^2 + \sqrt{3}\,z + 1$. The conclusion is that

$$z^4 - z^2 + 1 = (z^2 - \sqrt{3}\,z + 1)(z^2 + \sqrt{3}\,z + 1).$$

Notice that there are no complex numbers here, yet working this out without
them would have been pretty darn tricky.
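If you’d like reassurance that the real factorization is right, you can multiply the two quadratics back together numerically. This is just a sketch: `poly_mul` is a helper name I made up, and because $\sqrt{3}$ is stored as a floating-point number, the coefficients only match up to rounding error.

```python
import math

def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists, highest power first."""
    result = [0.0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            result[i + j] += a * b
    return result

s = math.sqrt(3)
# (z^2 - sqrt(3) z + 1)(z^2 + sqrt(3) z + 1), coefficients of z^4 down to z^0
product = poly_mul([1, -s, 1], [1, s, 1])
expected = [1, 0, -1, 0, 1]     # coefficients of z^4 - z^2 + 1
```

Comparing `product` with `expected` entry by entry, allowing a tiny tolerance, confirms the factorization.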


28.5 Solving $e^z = w$



Now it’s time to see how to solve equations of the form $e^z = w$ for given $w$.
It’d be nice if we could just write $z = \ln(w)$, but this isn’t very helpful. For
example, what exactly is $\ln(-\sqrt{3} + i)$? Let’s try to answer this question.

Fortunately, solving $e^z = w$ isn’t much harder than solving $z^n = w$; in fact,
if anything, it’s simpler. Before we see how to do this, we need to understand
$e^z$ a little better. Let’s see what happens if we write $z = x + iy$. We get

$$e^z = e^{x + iy} = e^xe^{iy}.$$

So what? Well, the main point is that this is already in polar form. The
modulus is $e^x$ and the argument is $y$. If you prefer, $r = e^x$ (remember, $e^x$
is real and positive) and $\theta = y$. This means that if $z$ is in Cartesian form
$x + iy$, then $e^z$ is automatically in polar form: $e^z = e^xe^{iy}$. So, the main
difference between solving $e^z = w$ and $z^n = w$ is that you don’t need to put
$z$ in polar form in the first case, whereas you do in the second case. A sort of
by-product of this is that there are infinitely many solutions to the equation
$e^z = w$ (unless $w = 0$, in which case there are no solutions).

Let’s solve $e^z = -\sqrt{3} + i$. We have already converted the right-hand side
to polar coordinates as $2e^{i(5\pi/6)}$ (see page 604). To handle the left-hand side,
write $z = x + iy$ in Cartesian coordinates, so $e^z = e^xe^{iy}$. So, changing the
original equation to polar form, we get

$$e^xe^{iy} = 2e^{i(5\pi/6)}.$$

Now, this separates into two equations:

$$e^x = 2 \quad\text{and}\quad e^{iy} = e^{i(5\pi/6)}.$$

To solve the first equation, we have to take logarithms to see that $x = \ln(2)$.
The second is handled by our important principle to get $y = 5\pi/6 + 2\pi k$,
where $k$ is an integer. Finally, putting these values into $z = x + iy$, we get

$$z = \ln(2) + i\left(\frac{5\pi}{6} + 2\pi k\right),$$

where $k$ is any integer. In this case, we do get a different value of $z$ for each
value of $k$, so we need to use them all. Let’s plot some of the possible values
of $z$ corresponding to $k = -2, -1, 0, 1,$ and 2 (I’ll use a different scale on the
axes for clarity):

[Figure: five solutions on the vertical line $x = \ln(2)$, with imaginary parts $-19\pi/6$, $-7\pi/6$, $5\pi/6$, $17\pi/6$, and $29\pi/6$]

So the solutions are equally spaced on the vertical line $x = \ln(2)$. Incidentally,
this means that they form an arithmetic progression of complex numbers.
Although only five solutions are shown in the above picture, you should bear in
mind that there are actually infinitely many solutions to the original equation
$e^z = -\sqrt{3} + i$.
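A quick numerical check makes the “infinitely many solutions” claim feel more concrete. In this sketch (the function name `solution` is my own), each value $z = \ln(2) + i(5\pi/6 + 2\pi k)$ is fed back into the exponential to see that it lands on $-\sqrt{3} + i$.

```python
import cmath
import math

w = complex(-math.sqrt(3), 1)     # right-hand side of e^z = w

def solution(k):
    """The solution z = ln(2) + i(5*pi/6 + 2*pi*k) belonging to the integer k."""
    return complex(math.log(2), 5 * math.pi / 6 + 2 * math.pi * k)

solutions = [solution(k) for k in range(-2, 3)]      # k = -2, -1, 0, 1, 2
images = [cmath.exp(z) for z in solutions]           # each should be close to w
```

All five entries of `solutions` are different complex numbers with the same real part, yet every entry of `images` agrees with `w` up to rounding error.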

Let’s look at one more example. Suppose you want to solve $e^{2iz+3} = i$.
The exponent $2iz + 3$ makes this a little more complicated than the previous
example, but it’s not too bad. We’ve already seen that the right-hand side in
polar coordinates is $e^{i\pi/2}$, but how about the left-hand side? Once again, we
write $z = x + iy$, but now we need $2iz + 3 = 2i(x + iy) + 3 = (-2y + 3) + i(2x)$.
So, the polar form of the left-hand side is given by

$$e^{2iz+3} = e^{-2y+3}e^{i(2x)}.$$

Notice how the factor of $i$ switched the real and imaginary parts (and also
the sign of $y$). Anyway, translating our equation $e^{2iz+3} = i$ into polar form,
we have

$$e^{-2y+3}e^{i(2x)} = 1e^{i\pi/2}.$$

This leads to the equations

$$e^{-2y+3} = 1 \quad\text{and}\quad e^{i(2x)} = e^{i\pi/2}.$$

To solve the first equation, take logs to get $-2y + 3 = \ln(1) = 0$, so $y = \frac{3}{2}$.
To solve the second equation, use the boxed principle to get $2x = \pi/2 + 2\pi k$,
where $k$ is an integer. This means that $x = \pi/4 + \pi k$, so since $z = x + iy$, we
have

$$z = \frac{\pi}{4} + \pi k + \frac{3}{2}\,i,$$

where $k$ is an integer. Let’s plot what these solutions look like for $k = -2,
-1, 0, 1,$ and 2, bearing in mind that these are only five of the infinitely many
solutions:

[Figure: five solutions on the horizontal line $y = \frac{3}{2}$, with real parts $-7\pi/4$, $-3\pi/4$, $\pi/4$, $5\pi/4$, and $9\pi/4$]

Once again, the solutions are in arithmetic progression, but this time they lie
on the horizontal line $y = \frac{3}{2}$.


28.6 Some Trigonometric Series


A trigonometric series is a series of the form

$$\sum_{n=0}^{\infty} \left(a_n\cos(n\theta) + b_n\sin(n\theta)\right)$$

for some coefficients $\{a_n\}$ and $\{b_n\}$. In this section, we’ll see that there are a
few such series which can be simplified.

For example, consider the trigonometric series

$$\sum_{n=0}^{\infty} \frac{\sin(n\theta)}{n!},$$

where $\theta$ is real. Note that this is not a power series in $\theta$, since $\sin(n\theta)$ is not
a power of $\theta$. On the other hand, we can make the whole thing into a power
series by using the complementary series

$$\sum_{n=0}^{\infty} \frac{\cos(n\theta)}{n!}$$

in a clever way. In fact, we can find both series at once. The key is Euler’s
identity. Watch carefully, because this is a sneaky trick: combine the two
series like this:

$$\sum_{n=0}^{\infty} \frac{\cos(n\theta)}{n!} + i\sum_{n=0}^{\infty} \frac{\sin(n\theta)}{n!}.$$

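OK, so this is one series plus $i$ times the other. So what? Well, by massaging
the sums* and then using Euler’s identity, this simplifies to

$$\sum_{n=0}^{\infty} \frac{\cos(n\theta) + i\sin(n\theta)}{n!} = \sum_{n=0}^{\infty} \frac{e^{in\theta}}{n!}.$$

Finally, use the exponential rules to write $e^{in\theta}$ as $(e^{i\theta})^n$; the sum becomes

$$\sum_{n=0}^{\infty} \frac{(e^{i\theta})^n}{n!}.$$

Now, the last sum looks familiar. In fact, we saw in Section 28.1.1 above that

$$\sum_{n=0}^{\infty} \frac{z^n}{n!} = e^z$$

for all complex numbers $z$. Now we just have to substitute $z = e^{i\theta}$ to get

$$\sum_{n=0}^{\infty} \frac{(e^{i\theta})^n}{n!} = e^{e^{i\theta}}.$$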

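If you’ve been following this chain of reasoning, you should see that we’ve
proved that

$$\sum_{n=0}^{\infty} \frac{\cos(n\theta)}{n!} + i\sum_{n=0}^{\infty} \frac{\sin(n\theta)}{n!} = e^{e^{i\theta}}.$$

Now what? Well, we need to convert the right-hand side into Cartesian form.
To do this, write $e^{i\theta} = \cos(\theta) + i\sin(\theta)$, so

$$e^{e^{i\theta}} = e^{\cos(\theta) + i\sin(\theta)} = e^{\cos(\theta)}e^{i\sin(\theta)}.$$

This is a good start: it is the polar form of $e^{e^{i\theta}}$. To get the Cartesian
form, we need to convert $e^{i\sin(\theta)}$ into $\cos(\sin(\theta)) + i\sin(\sin(\theta))$. Putting it all
together, we get

$$\sum_{n=0}^{\infty} \frac{\cos(n\theta)}{n!} + i\sum_{n=0}^{\infty} \frac{\sin(n\theta)}{n!} = e^{\cos(\theta)}\cos(\sin(\theta)) + ie^{\cos(\theta)}\sin(\sin(\theta)).$$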


*This needs some justification. It turns out that everything’s OK because both our 
series converge absolutely. 
Now, if two complex numbers are equal, then their real parts must be equal,
and also their imaginary parts must be equal. This leads to the following two
equations, which are valid for all real $\theta$:

$$\sum_{n=0}^{\infty} \frac{\cos(n\theta)}{n!} = e^{\cos(\theta)}\cos(\sin(\theta)) \quad\text{and}\quad \sum_{n=0}^{\infty} \frac{\sin(n\theta)}{n!} = e^{\cos(\theta)}\sin(\sin(\theta)).$$
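Formulas like these look so strange that it’s worth testing them numerically. Here is a minimal sketch; the function name `partial_sums`, the choice of 30 terms, and the test angle 0.7 are all assumptions of mine. Because of the $n!$ in the denominator, 30 terms are already far more than enough to match the closed forms to machine precision.

```python
import math

def partial_sums(theta, terms=30):
    """Partial sums of sum cos(n*theta)/n! and sum sin(n*theta)/n! for n < terms."""
    cos_sum = sum(math.cos(n * theta) / math.factorial(n) for n in range(terms))
    sin_sum = sum(math.sin(n * theta) / math.factorial(n) for n in range(terms))
    return cos_sum, sin_sum

theta = 0.7                                   # arbitrary real test angle
cos_sum, sin_sum = partial_sums(theta)

# Closed forms from the two equations above.
closed_cos = math.exp(math.cos(theta)) * math.cos(math.sin(theta))
closed_sin = math.exp(math.cos(theta)) * math.sin(math.sin(theta))
```

You can change `theta` to any other real number and compare `cos_sum` with `closed_cos` and `sin_sum` with `closed_sin` again.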



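Not easy, but this is basically what you have to do. I’ll do one more example,
without all the explanations. Your task is to follow this and explain each step.
The example is to find

$$\sum_{n=0}^{\infty} \frac{\cos(n\theta)}{3^n} \quad\text{and}\quad \sum_{n=0}^{\infty} \frac{\sin(n\theta)}{3^n}.$$

Following the pattern of the above example, we have

$$\sum_{n=0}^{\infty} \frac{\cos(n\theta)}{3^n} + i\sum_{n=0}^{\infty} \frac{\sin(n\theta)}{3^n} = \sum_{n=0}^{\infty} \frac{\cos(n\theta) + i\sin(n\theta)}{3^n} = \sum_{n=0}^{\infty} \frac{e^{in\theta}}{3^n} = \sum_{n=0}^{\infty} \left(\frac{e^{i\theta}}{3}\right)^n.$$

Now this is a geometric series with ratio $e^{i\theta}/3$. This last number is in polar
form with modulus 1/3, which is less than 1; so the geometric series should
converge. By the formula for the sum of a geometric series (see Section 23.1
of Chapter 23), we have

$$\sum_{n=0}^{\infty} \left(\frac{e^{i\theta}}{3}\right)^n = \frac{1}{1 - \frac{1}{3}e^{i\theta}}.$$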

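We now have the wretched task of converting this into Cartesian coordinates.
First, try it and see if you can do it. If not, at least try to understand the
following steps:

$$\begin{aligned}
\frac{1}{1 - \frac{1}{3}e^{i\theta}} &= \frac{1}{1 - \frac{1}{3}\cos(\theta) - i\frac{1}{3}\sin(\theta)} \\
&= \frac{1}{1 - \frac{1}{3}\cos(\theta) - i\frac{1}{3}\sin(\theta)} \times \frac{1 - \frac{1}{3}\cos(\theta) + i\frac{1}{3}\sin(\theta)}{1 - \frac{1}{3}\cos(\theta) + i\frac{1}{3}\sin(\theta)} \\
&= \frac{1 - \frac{1}{3}\cos(\theta) + i\frac{1}{3}\sin(\theta)}{\left(1 - \frac{1}{3}\cos(\theta)\right)^2 + \left(\frac{1}{3}\sin(\theta)\right)^2} \\
&= \frac{1 - \frac{1}{3}\cos(\theta) + i\frac{1}{3}\sin(\theta)}{1 - \frac{2}{3}\cos(\theta) + \frac{1}{9}\cos^2(\theta) + \frac{1}{9}\sin^2(\theta)} \\
&= \frac{1 - \frac{1}{3}\cos(\theta) + i\frac{1}{3}\sin(\theta)}{1 - \frac{2}{3}\cos(\theta) + \frac{1}{9}} \\
&= \frac{9 - 3\cos(\theta) + 3i\sin(\theta)}{10 - 6\cos(\theta)} \\
&= \frac{9 - 3\cos(\theta)}{10 - 6\cos(\theta)} + i\,\frac{3\sin(\theta)}{10 - 6\cos(\theta)}.
\end{aligned}$$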
After all this, we’re ready to write

$$\sum_{n=0}^{\infty} \frac{\cos(n\theta)}{3^n} + i\sum_{n=0}^{\infty} \frac{\sin(n\theta)}{3^n} = \frac{9 - 3\cos(\theta)}{10 - 6\cos(\theta)} + i\,\frac{3\sin(\theta)}{10 - 6\cos(\theta)};$$

since the real and imaginary parts must be equal, we conclude that

$$\sum_{n=0}^{\infty} \frac{\cos(n\theta)}{3^n} = \frac{9 - 3\cos(\theta)}{10 - 6\cos(\theta)} \quad\text{and}\quad \sum_{n=0}^{\infty} \frac{\sin(n\theta)}{3^n} = \frac{3\sin(\theta)}{10 - 6\cos(\theta)}$$

for any real number $\theta$. As you see, these problems are quite hard!
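Since the algebra here is so easy to botch, a numerical spot check is a good habit. The sketch below sums the first 60 terms of each series at a test angle and compares the results with the two closed forms. The name `geometric_trig_sums`, the number of terms, and the angle 1.2 are my own choices.

```python
import math

def geometric_trig_sums(theta, terms=60):
    """Partial sums of sum cos(n*theta)/3^n and sum sin(n*theta)/3^n for n < terms."""
    cos_sum = sum(math.cos(n * theta) / 3 ** n for n in range(terms))
    sin_sum = sum(math.sin(n * theta) / 3 ** n for n in range(terms))
    return cos_sum, sin_sum

theta = 1.2                                   # arbitrary real test angle
cos_sum, sin_sum = geometric_trig_sums(theta)

# Closed forms from the conclusion above.
denominator = 10 - 6 * math.cos(theta)
closed_cos = (9 - 3 * math.cos(theta)) / denominator
closed_sin = 3 * math.sin(theta) / denominator
```

The terms shrink like $1/3^n$, so 60 of them leave a remainder far below rounding error.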


28.7 Euler’s Identity and Power Series


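Let’s finish the chapter with a justification of Euler’s identity

$$e^{i\theta} = \cos(\theta) + i\sin(\theta)$$

using power series. By the definition of $e^z$ from Section 28.1.1 above, with
$z$ replaced by $i\theta$, we see that

$$\begin{aligned}
e^{i\theta} &= 1 + i\theta + \frac{(i\theta)^2}{2!} + \frac{(i\theta)^3}{3!} + \frac{(i\theta)^4}{4!} + \frac{(i\theta)^5}{5!} + \frac{(i\theta)^6}{6!} + \frac{(i\theta)^7}{7!} + \cdots \\
&= 1 + i\theta - \frac{\theta^2}{2!} - i\frac{\theta^3}{3!} + \frac{\theta^4}{4!} + i\frac{\theta^5}{5!} - \frac{\theta^6}{6!} - i\frac{\theta^7}{7!} + \cdots.
\end{aligned}$$

Since the powers of $i$ keep cycling through the values $1, i, -1, -i$, we conclude
that the even powers in the above series all have real coefficients, whereas
the odd powers all have imaginary coefficients. Furthermore, every second
even-power term is negative and the others are positive; the same is true for
the odd powers. So, the real part of $e^{i\theta}$ is

$$1 - \frac{\theta^2}{2!} + \frac{\theta^4}{4!} - \frac{\theta^6}{6!} + \cdots = \cos(\theta),$$

and the imaginary part is

$$\theta - \frac{\theta^3}{3!} + \frac{\theta^5}{5!} - \frac{\theta^7}{7!} + \cdots = \sin(\theta).$$

(See Section 26.2 in Chapter 26 to refresh your memory about these Maclaurin
series.) From these last two equations, it follows that $e^{i\theta} = \cos(\theta) + i\sin(\theta)$.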
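You can watch this happen numerically by adding up the first few dozen terms of the exponential series at a purely imaginary number. In this sketch, the function name `exp_series`, the cutoff of 40 terms, and the test value 2.0 are assumptions chosen for illustration.

```python
import math

def exp_series(z, terms=40):
    """Partial sum 1 + z + z^2/2! + ... of the exponential series, using `terms` terms."""
    total = 0j
    term = 1 + 0j                 # current term z^n / n!, starting with n = 0
    for n in range(terms):
        total += term
        term *= z / (n + 1)       # turn z^n / n! into z^(n+1) / (n+1)!
    return total

theta = 2.0                       # arbitrary real test value
series_value = exp_series(1j * theta)
euler_value = complex(math.cos(theta), math.sin(theta))
```

The real part of `series_value` should match $\cos(\theta)$ and the imaginary part should match $\sin(\theta)$, up to rounding error.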







CHAPTER 29


Volumes, Arc Lengths, and Surface Areas 


We have used definite integrals to find areas. Now we’re going to use them to 
find volumes, lengths of curves, and surface areas. For volumes and surface 
areas, we’ll pay special attention to solids which are formed by revolving a 
region in the plane about some axis which lies in the plane; such solids are 
called solids of revolution. In the case of volumes, we’ll also look at some more 
general solids. Here, then, is the game plan for this chapter: 

• finding volumes of solids of revolution using the disc and shell methods; 

• finding volumes of more general solids; 

• finding arc lengths of smooth curves and speeds of parametric particles; 
and 

• finding surface areas of solids of revolution. 

29.1 Volumes of Solids of Revolution

We’ll start with finding volumes of solids of revolution. The idea is that there 
is some region in the plane, and some axis also in the plane, and a solid is 
formed by revolving the region about the axis. For our purposes, the axis 
will always be parallel to the x-axis or the y-axis. (It is possible to deal 
with diagonal axes, but it’s a real pain unless you use techniques from linear 
algebra.) 

Before we put on our 3D glasses, however, let’s remind ourselves how
definite integrals work. We originally looked at this in Chapter 16, but here’s
a quick review of some of the main ideas. Let’s work in the context of finding
the area of the region below the curve

$$y = \sqrt{1 - (x - 3)^2}$$

and above the x-axis. What does this look like? Well, if we square the
equation and rearrange, we get $(x - 3)^2 + y^2 = 1$; the graph of this relation is
the circle of radius 1 unit centered at (3, 0), so our function is the top half of
the circle:

[Figure: the shaded region under the semicircle $y = \sqrt{1 - (x - 3)^2}$ between x = 2 and x = 4]

By the definition of the definite integral, we know that the shaded area (in
square units) is

$$\int_2^4 \sqrt{1 - (x - 3)^2}\,dx,$$

which can also be written as $\int_2^4 y\,dx$.

On the other hand, to find the area of this semicircle using a Riemann
sum, we have to chop the base on the x-axis into little segments, then build
the segments up into strips. The strips don’t have to have the same width,
and the only thing you need to make sure of is that the top of each strip cuts
the curve somewhere (or touches the curve at one of its corners). The total
area of the strips can easily be worked out, since it is just the sum of areas of
rectangles. This area is an approximation for the actual area of the semicircle;
the thinner the strips, the better the approximation, as you can see:

[Figure: two approximations of the semicircle on [2, 4] by rectangular strips, the second using thinner strips than the first]

Let’s just check out one generic strip. To make things a little easier, we’ll
assume that the top left-hand corner lies on the curve. As we’ve seen in
Section 16.4 of Chapter 16, it doesn’t matter which strips you choose, as long
as the tops of all the strips pass through the curve. Anyway, here’s what one
strip looks like:

[Figure: one generic strip of width dx and height y, with its top left-hand corner on the curve]

Since this rectangular strip has base length dx units and height y units, its
area is y dx square units. Now all we have to do is add up the areas of all
the little strips, while simultaneously letting the maximum strip width tend
to zero. The beauty of the notation is that you can accomplish both simply
by putting an integral sign in front of the strip area and using the correct
bounds. In our example, x lies in the interval [2, 4], and the area of one little
strip is y dx square units; so the area of all the strips, in the limit as the
maximum strip width goes to zero, is $\int_2^4 y\,dx$ square units.

So here’s the pattern: we make a little strip of width dx units and height
y units at position x on the x-axis, work out its area, then put a definite 
integral sign in front to get the total area we’re looking for. This technique 
doesn’t just work for areas — it also works for volumes. In particular, let’s see 
how it works using two different methods for finding volumes of revolution: 
the disc method and the shell method. 
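To see the “thinner strips, better approximation” idea in action, here is a rough sketch that adds up strip areas for the semicircle on [2, 4]. The function name `riemann_area` and the strip counts are my own choices; the strips all have equal width and their top left-hand corners lie on the curve, as in the generic strip described above.

```python
import math

def riemann_area(num_strips):
    """Sum of strip areas y*dx under y = sqrt(1 - (x - 3)^2) on [2, 4]."""
    width = (4 - 2) / num_strips                          # dx for every strip
    total = 0.0
    for i in range(num_strips):
        x = 2 + i * width                                 # left edge of the strip
        height = math.sqrt(max(0.0, 1 - (x - 3) ** 2))    # y at the top left corner
        total += height * width                           # area of one strip
    return total

coarse = riemann_area(10)          # ten fairly wide strips
fine = riemann_area(100_000)       # many very thin strips
exact = math.pi / 2                # area of a semicircle of radius 1
```

The value `coarse` is only a rough approximation to $\pi/2$, while `fine` is much closer.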

29.1.1 The disc method 

Suppose that we revolve the semicircle from the previous section about the
x-axis. This will give us a sphere. (Can you see why?) Let’s try to work out
its volume. We’ll start with one strip, just like in the picture at the end of
the previous section, and revolve that strip about the x-axis. Here’s what we
get:

[Figure: the strip revolved about the x-axis, forming a thin disc of thickness dx]
This is a thin disc of width dx units and radius y units. Think of it as a
cylinder on its side; the radius is y units and the height is dx units. Since the
volume of a cylinder of radius r units and height h units is $\pi r^2h$ cubic units,
the volume of our thin disc is $\pi y^2\,dx$ cubic units. So, now we take a number
of strips so that their bases form a partition of our interval [2, 4], and revolve
them all about the x-axis. For example, if you use five strips, you might get
something like this:

[Figure: five discs of varying radius lined up along the x-axis, approximating the sphere]

As perfect spheres go, the above object is pretty crappy, but its volume is a
decent approximation to the sphere’s. And the thinner the discs you use, the
better the approximation. In the limit, as the maximum disc thickness goes
down to zero, the approximation becomes perfect: the total volume of the
discs tends toward the volume of the sphere. Again, the idea of “adding up
all the volumes while letting the maximum disc thickness go down to zero” is
realized simply by taking the volume of an arbitrary disc ($\pi y^2\,dx$ cubic units)
and integrating over the interval we want. In our case, $y = \sqrt{1 - (x - 3)^2}$
and x goes from 2 to 4, so the volume of the sphere (in cubic units) is

$$\int_2^4 \pi y^2\,dx = \pi\int_2^4 \left(1 - (x - 3)^2\right)dx = \frac{4\pi}{3}.$$
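Here is a small sketch of the disc method as a computation: stack up thin discs of volume $\pi y^2\,dx$ along [2, 4] and add their volumes. The function name `disc_volume` is hypothetical, and for convenience I sample the radius at the middle of each strip rather than at its left-hand corner; in the limit the choice doesn’t matter.

```python
import math

def disc_volume(num_discs):
    """Add up disc volumes pi*y^2*dx for the semicircle revolved about the x-axis."""
    thickness = (4 - 2) / num_discs               # dx for every disc
    total = 0.0
    for i in range(num_discs):
        x = 2 + (i + 0.5) * thickness             # middle of the strip's base
        y_squared = 1 - (x - 3) ** 2              # y^2 on the semicircle
        total += math.pi * y_squared * thickness  # volume of one thin disc
    return total

five_discs = disc_volume(5)        # the lumpy five-disc approximation
many_discs = disc_volume(1000)     # much thinner discs
sphere = 4 * math.pi / 3           # volume of a sphere of radius 1
```

Even `five_discs` is in the right ballpark, and `many_discs` agrees with the volume of a unit sphere to several decimal places.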



... poppy seeds). Let’s approximate ... time we’ll revolve each strip about ...
saw before, a typical strip looks ...

When you revolve it about the ... cylindrical shell:

We’re going to approximate our ... letting the maximum shell thickness ...
five strips to approximate the ...

This weird solid is a pretty lumpy bagel half, but its volume is fairly close to
what we’re looking for. The thinner the maximum shell thickness, the better 
the approximation. As before, integrating takes care of both the addition of 
all the shell volumes and also taking the limit as the maximum shell thickness 
goes to zero. 

First we need to find the volume of one generic shell. The easiest way to 
do this is to think of the shell as a really thin metal can without a top or 
bottom. As you can see from the picture of the shell on the previous page, 
the height of the can is y units, the radius is x units, and the thickness is dx 
units. Imagine cutting the can down the side with some sharp scissors, then 
unfolding it and flattening it out into a thin rectangle-like piece of metal. It’s 
not actually a rectangle, of course. You see, a rectangle is a 2-dimensional 
object, whereas the unrolled can is 3-dimensional — although the can is pretty 
thin, it still has some thickness. (Even a piece of paper has some thickness, 
or else a ream of paper would be really really thin.) Now it’s actually not 
even a rectangular prism, since the inner radius of the can isn’t exactly the 
same as the outer radius. But the point is, it’s almost a rectangular prism. 
The thinner the can gets, the closer it is to a rectangular prism, and when we 
take limits in the end (using the integral), everything will work out.* So, the 
idealized version of the unfolded can looks like this: 


[Figure: the unrolled shell as a thin slab with long side $2\pi x$, height y, and thickness dx]

The thickness is dx units, and the side we cut along is still the height of the
cylindrical shell, that is, y units. How about the long side? Well, that is equal
to the circumference of the shell (think about it!) which is $2\pi x$ units, since
the radius of the shell is basically x units. So, the volume of the shell is very
close to $2\pi xy\,dx$ cubic units. Now all we have to do is integrate from x = 2
to x = 4 to see that the volume of the bagel half (in cubic units) is

$$\int_2^4 2\pi xy\,dx = 2\pi\int_2^4 x\sqrt{1 - (x - 3)^2}\,dx.$$

Great: we’ve now reduced the problem to evaluating a definite integral, but
it’s a bit of a messy one. Start off by substituting t = x − 3, so dt = dx; also,
when x = 2, we have t = −1, and when x = 4, we see that t = 1. So in t-land,
the integral becomes

$$2\pi\int_{-1}^{1} (t + 3)\sqrt{1 - t^2}\,dt = 2\pi\left(\int_{-1}^{1} t\sqrt{1 - t^2}\,dt + 3\int_{-1}^{1} \sqrt{1 - t^2}\,dt\right).$$


*More formally, we can view the volume of the shell as the difference in volumes of
the outer shell (of radius x + dx units) and the inner shell (of radius x units). Both shells
have height y units, so the volume of the shell is $\pi y((x + dx)^2 - x^2)$, which simplifies to
$2\pi xy\,dx + \pi y(dx)^2$ cubic units. When this is integrated, the second term vanishes due to
the negligible quantity $(dx)^2$.
The first integral could be done by substituting $u = 1 - t^2$, and the second
could be done by a trig substitution. A better way to do them is to note that
the first integral is actually equal to 0, since the integrand is an odd function
of t and the region of integration [−1, 1] is symmetric about t = 0. (We proved
this shortcut at the end of Section 18.1.1 of Chapter 18.) Furthermore, the
easiest way to do the second integral (ignoring the factor of 3 out front for the
moment) is to realize that it’s equal to the area in square units of a semicircle
of radius 1 unit, which is $\pi/2$. So without too much work, we see that the total
answer is $2\pi \times 3 \times \pi/2 = 3\pi^2$; therefore the volume of the bagel half is $3\pi^2$ cubic units. The
method we just used is, unsurprisingly, called the shell method (also known
as the method of cylindrical shells).
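The answer $3\pi^2$ is easy to test by actually adding up shell volumes $2\pi xy\,dx$ for the semicircle on [2, 4]. This is only a sketch: `shell_volume` is a name I made up, and the radius and height of each shell are sampled at the middle of its strip.

```python
import math

def shell_volume(num_shells):
    """Add up shell volumes 2*pi*x*y*dx for the semicircle revolved about the y-axis."""
    thickness = (4 - 2) / num_shells                     # dx for every shell
    total = 0.0
    for i in range(num_shells):
        x = 2 + (i + 0.5) * thickness                    # radius of the shell
        y = math.sqrt(1 - (x - 3) ** 2)                  # height of the shell
        total += 2 * math.pi * x * y * thickness         # volume of one thin shell
    return total

five_shells = shell_volume(5)          # the lumpy five-shell approximation
many_shells = shell_volume(200_000)    # very thin shells
bagel_half = 3 * math.pi ** 2          # the value found above
```

With five shells the total is only roughly right, but with very thin shells it settles down to $3\pi^2$, which is about 29.6.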

29.1.3 Summary ... and variations 

So far we have seen how to use the disc and shell methods in the special
case of our semicircle. The same method works for general regions which are
contained between a curve, the x-axis, and two vertical lines:

[Figure: the region under the curve y = f(x) between the vertical lines x = a and x = b]

By the same reasoning that we used above in the special case of the semicircle,
we can arrive at the following principles:

• If you revolve the area under the curve y = f(x) between x = a and
x = b (as shown above) about the x-axis, then the disc method applies
and the volume is equal to

$$\int_a^b \pi y^2\,dx \text{ cubic units.}$$

• If you revolve the area under the curve y = f(x) between x = a and
x = b (as shown above) about the y-axis, then the shell method applies
and the volume is equal to

$$\int_a^b 2\pi xy\,dx \text{ cubic units.}$$

It’s not a bad idea to know these formulas by heart, but it’s an even better 
idea to be able to derive them by knowing how to find the volume of a typical 
disc or shell. This will be especially useful if you encounter one (or more) of 
the following variations: 

1. The region to be revolved might lie between a curve and the y-axis
(instead of the x-axis).

2. The region to be revolved might lie between two curves, instead of just
being a region under a curve down to an axis.

3. The axis of revolution may be parallel to the x-axis or y-axis, not the
axis itself.



Any combination of these cases can be handled by taking a typical strip and 
revolving it appropriately, then integrating; before we see how, it’s important 
to know how to decide whether to use the disc method or the shell method. 
Notice that when you use the disc method, the strips are revolved about an 
axis parallel to their short sides; whereas when you use the shell method, the 
strips are revolved about an axis perpendicular to their short sides. That is, 
after you carve up the region into little strips, then: 

• if the really thin bit of each strip is parallel to the axis of revolution, 
the disc method applies; 

• whereas if the really thin bit of each strip is perpendicular to the axis 
of revolution, the shell method applies. 


Armed with this knowledge, we can now look at our three variations one by 
one. 


29.1.4 Variation 1: regions between a curve and the y-axis

If the region is between the curve and the y-axis, you probably want to take
strips lying on their sides, with the thin part of the strip along the y-axis:

[Figure: a region between a curve and the y-axis, running from y = A to y = B, with a typical horizontal strip of thickness dy]

We actually did the same thing when we saw how to find the area of a region
bounded by some curve and the y-axis, way back in Section 16.4.3 of
Chapter 16. In any case, suppose that you want to find the volume of the
solid formed by revolving this region about the y-axis. You should use the
disc method, since the thin side of the strip is parallel to that axis. A typical
strip at position y has width dy and length x units, so the resulting disc has
volume $\pi x^2\,dy$ cubic units. When you integrate this to find the total volume,
be very careful that the limits of integration are relevant points on the y-axis,
not the x-axis, since the integral is taken with respect to y (because of the
dy). In particular, we need the integral to go from A to B, not a to b (see the
above diagram), so the volume we want is $\int_A^B \pi x^2\,dy$.

There’s another way to look at this. Look at the above picture and rest
your head on your right shoulder. The y-axis becomes horizontal, but everything’s
back to front, so try to visualize what would happen if the page
were transparent and you looked at the diagram in reverse (still with your
head tipped over). Now the y-axis and x-axis have switched places! This suggests
that you can just switch the variables x and y wherever you see them,
provided that you also make the bounds of integration refer to points on the
y-axis. Indeed, if we do this to our formula $V = \int_a^b \pi y^2\,dx$ from Section 29.1.3
above, we see that the volume of a region down to the y-axis revolved about
the y-axis is $\int_A^B \pi x^2\,dy$, which agrees with what we’ve seen above.

How about if the above region is revolved about the x-axis, instead of the
y-axis? Simply adapt the shell formula $V = \int_a^b 2\pi xy\,dx$ from Section 29.1.3
above to see that the volume we want is $\int_A^B 2\pi yx\,dy$. This makes sense, since
revolving a typical strip about the x-axis gives a shell with thickness dy, height
x, and radius y units. You should draw what happens when you unfold such a
shell into a thin shape which is approximately a rectangular prism, calculate
its volume, and see that you do indeed get $2\pi yx\,dy$. In summary, then, the
rule of thumb is this:


If the region lies between a curve and the y-axis, switch x and y.



As always, drawing a typical strip, revolving it, calculating the resulting volume,
and integrating is the most reliable way; the above rule of thumb is just
a guide.

Here’s an example of Variation 1. Let R denote the region between the
curve $y = \sqrt{x}$, the line y = 2, and the y-axis:

[Figure: the region R bounded by the curve $y = \sqrt{x}$, the line y = 2, and the y-axis; the curve reaches y = 2 at x = 4]

Let’s work out the volume of the solid of revolution of R about the y-axis and also
about the x-axis. In the first case, we use the disc method, since the region
lies between the curve and the y-axis, and we’re revolving about the same
axis. The volume is then

$$\int_0^2 \pi x^2\,dy.$$

Since $y = \sqrt{x}$, we have $x = y^2$, so $x^2 = y^4$. This means that the volume is

$$\int_0^2 \pi y^4\,dy = \pi\left[\frac{y^5}{5}\right]_0^2 = \frac{32\pi}{5}$$

cubic units. On the other hand, the volume of revolution of R about the
x-axis is done by shells, and we see that it is

$$\int_0^2 2\pi yx\,dy = 2\pi\int_0^2 y^3\,dy,$$

since $yx = y \times y^2 = y^3$; check that this works out to be $8\pi$ cubic units. Please
make sure you can draw a typical strip in each case and justify the above
formulas. Also note that the integrals must go between 0 and 2, not 0 and 4:
after all, the integration is with respect to y (not x!) and the relevant y-range
is [0, 2], as can be seen on the above graph.
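Both answers in this example come down to integrating a power of y from 0 to 2, so they can be checked exactly with fractions. In this sketch, `integrate_power` is a hypothetical helper that just applies the power rule; the results are the numbers that multiply $\pi$ in each volume.

```python
from fractions import Fraction

def integrate_power(exponent, lower, upper):
    """Exact integral of y**exponent from lower to upper, using the power rule."""
    p = exponent + 1
    return Fraction(upper ** p - lower ** p, p)

# Discs about the y-axis: integral of pi*y^4 dy, so the volume is pi times this.
disc_coefficient = integrate_power(4, 0, 2)

# Shells about the x-axis: 2*pi times the integral of y^3 dy, so pi times this.
shell_coefficient = 2 * integrate_power(3, 0, 2)
```

The first coefficient should be 32/5 and the second should be 8, matching the volumes $32\pi/5$ and $8\pi$ cubic units.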
29.1.5 Variation 2: regions between two curves

Suppose the region to be revolved lies between two curves. We’ll handle this 
situation in the same way as finding the area of a region between two curves 
in Section 16.4.2 of Chapter 16. The general idea is to take the top curve 
and revolve the region under it all the way to the axis, to get a bigger solid 
than you want. Now take the bottom curve and revolve the region under it 
all the way to the axis, to get a solid which you actually need to cut out of 
the big solid and throw away to get the desired solid. Finally, subtract the 
small volume from the big one. Indeed, consider the following three regions: 




The region we want to revolve is shown in the left-hand picture; it is the set 
difference of the region under the top curve down to the x-axis (in the middle 
picture above) and the region under the bottom curve down to the x-axis 
(in the right-hand picture). Now, regardless of whether you revolve about 
the x-axis or the y-axis, the volume of revolution of the region we want is
equal to the difference between the volume of revolution of the big region and 
the volume of revolution of the small region. For example, if you revolve the 
region about the x-axis, then you get a cone-like structure with chopped-off 
ends and a weird-shaped hole going through the middle of it from left to right. 
The solid is the set difference of the filled-in version (with no hole) and the 
hole itself: 





So, here’s what we conclude: 


If the region lies between two curves, find the difference 
between the two corresponding volumes of revolution. 



Let’s look at a concrete example. Consider the finite region between the curves
$y = 2x^3$ and $y = x^4$, as shown below. What is the volume of the
solid formed by revolving the region about the x-axis?


















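Writing $y_1 = 2x^3$ for the top curve and $y_2 = x^4$ for the bottom curve, we
see that the volume we want is

$$\int_0^2 \pi y_1^2\,dx - \int_0^2 \pi y_2^2\,dx = \pi \int_0^2 (2x^3)^2\,dx - \pi \int_0^2 (x^4)^2\,dx.$$

Work this out and check that the answer is $1024\pi/63$ cubic units.
What about revolving the same region about the y-axis? Since we’re just
dealing with a region between two curves, we don’t have a particular bias toward
either axis, so we should actually be able to do this either by the disc
method or by the shell method.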

The volume we want is the difference between the volumes of revolution of
$y = x^4$ and $y = 2x^3$. The first of these volumes is bigger than the second,
since $x^4$ is to the right of $2x^3$; so let’s solve for x and put $x_1 = y^{1/4}$ and
$x_2 = (y/2)^{1/3}$. Using the disc method, with x and y switched (as in Variation 1
above) and integrating between y = 0 and y = 16 (not from 0 to 2!), we see
that the volume we want is

$$\int_0^{16} \pi x_1^2\,dy - \int_0^{16} \pi x_2^2\,dy = \pi \int_0^{16} (y^{1/4})^2\,dy - \pi \int_0^{16} \left((y/2)^{1/3}\right)^2 dy.$$



This works out to be $64\pi/15$ cubic units after a bit of fiddling, which you
should definitely try for practice. 

Let’s try to find the same volume by using shells. This time, we slice the 
region vertically: 



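Since $y_1 = 2x^3$ is above $y_2 = x^4$, we take the difference of volumes as follows:

$$\int_0^2 2\pi x y_1\,dx - \int_0^2 2\pi x y_2\,dx = 2\pi \int_0^2 x(2x^3)\,dx - 2\pi \int_0^2 x(x^4)\,dx,$$

which is $64\pi/15$ cubic units, the same answer as the one we just found using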
the disc method, of course! Note that when we use the disc method, we are 
thinking of the solid we want as being formed by one bowl with another bowl 
hollowed out of it, whereas the shell method is more like a basin with another 
slightly smaller basin removed. You should try to sketch some pictures to see 
what’s going on here. 
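
The three volumes for the region between $y = 2x^3$ and $y = x^4$ can be checked numerically. The following is a minimal Python sketch; the helper `simpson` and the variable names are illustrative assumptions, and the integrands are the "big volume minus small volume" differences described above.

```python
import math

def simpson(f, a, b, n=20000):
    """Composite Simpson's rule on [a, b] using an even number n of subintervals."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

# About the x-axis, discs with vertical strips: pi*y1**2 - pi*y2**2 on 0 <= x <= 2.
about_x_axis = simpson(lambda x: math.pi * ((2 * x**3) ** 2 - (x**4) ** 2), 0.0, 2.0)

# About the y-axis, discs with horizontal strips: x1 = y**(1/4), x2 = (y/2)**(1/3), 0 <= y <= 16.
discs = simpson(
    lambda y: math.pi * ((y ** 0.25) ** 2 - ((y / 2) ** (1 / 3)) ** 2), 0.0, 16.0
)

# About the y-axis, shells with vertical strips: 2*pi*x*(y1 - y2) on 0 <= x <= 2.
shells = simpson(lambda x: 2 * math.pi * x * (2 * x**3 - x**4), 0.0, 2.0)

print(about_x_axis, 1024 * math.pi / 63)
print(discs, shells, 64 * math.pi / 15)
```

The disc integrand has an unbounded derivative at y = 0, so Simpson's rule converges more slowly there; that is why the sketch uses a fairly large number of subintervals.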

This variation also applies when the area doesn’t go all the way down 
to the axis. For example, suppose we want to find the volume of revolution 
when the region between the curve $y = 1 + \sqrt{25 - x^2}$ and the line y = 1 is
revolved about the x-axis. Note that the curve is the top half of the circle
$x^2 + (y - 1)^2 = 25$ of radius 5 units centered at (0,1), so the region looks like
this: 

A typical strip is shown in the above picture. The width is dx, but the height
isn’t y: it’s $y - h$. In the picture, h is shown as a positive number, so $y - h$
is of course less than y, as it should be. If h happens to be negative, then
the height of the strip is more than y … but of course then $y - h$ is actually
greater than y, since h is negative! Regardless of the sign of h, we see that the
strip has height $y - h$, so the volume of the corresponding disc is $\pi (y - h)^2\,dx$,
and the volume of the whole solid of revolution is $\int_a^b \pi (y - h)^2\,dx$.

In fact, the only difference between this formula and the regular disc 
method is that y has been replaced by the quantity (y — h). As we saw 
in Section 1.3 of Chapter 1, this has the effect of translating the standard 
picture, where the region goes down to the x-axis, upward by h units (which
is actually downward if h is negative). The only problem with this is that it’s 
possible that the line y = h is actually above the curve, like this: 



In this case, the height of the strip is h — y, not y — h. This doesn’t really 
matter in the case of the disc method, since you square the height anyway, 
but it’s good to be careful about these things. Besides, the shell method is a 
different story. 

Indeed, suppose we now want to find the volume of the solid formed by 
revolving the region below about the axis x = h: 



Here we have to use the shell method, since the thin side of the strip is 
perpendicular to the axis of revolution. A typical shell has height y and 
thickness dx units, but the radius is now $x - h$ units instead of x units. You
should check that you agree that the volume of the shell is $2\pi (x - h) y\,dx$, so
the total volume is $\int_a^b 2\pi (x - h) y\,dx$ cubic units. Again, notice that this comes
from the standard formula for shells given in Section 29.1.3 above once you 
replace x by (x — h). This has the effect of translating the standard picture to 
the right by h units, including the axis of revolution — all we’ve done is slide 
the picture over. 

How about revolving the same region about the line x = 2? This is actually
a combination of Variation 1 and Variation 3, since the revolution is about an
axis parallel to the y-axis, so we’ll swap x and y, and also replace x by $(2 - x)$
to handle the translation. Note that it’s $(2 - x)$ instead of $(x - 2)$, since the
region is to the left of the line x = 2. Also, the integral will have to be from
1 to 8 since it’s with respect to y, not x. The volume is therefore

$$\int_1^8 \pi (2 - x)^2\,dy = \pi \int_1^8 (2 - y^{1/3})^2\,dy,$$

which simplifies to $8\pi/5$ cubic units. It’s a good idea to make sure that you
can also work this out by finding the volume of a typical disc, noting that this
time we have sliced the region into horizontal strips, as in Variation 1.

Now, what about if we revolve the same region about x = -3? This is
starting to get a little messy. If we use vertical strips, then we’ll need the
shell method because the thin side of each strip is perpendicular to the axis
of revolution. We’ll use a combination of Variation 2 and Variation 3. You
see, thinking vertically, the region lies between the two curves $y_1 = x^3$ (on the
top) and $y_2 = 1$ (on the bottom). Also, the quantity x needs to be replaced
by $(x + 3)$ in the standard formula for shells. This means that the volume is
given by

$$\int_1^2 2\pi (x + 3) y_1\,dx - \int_1^2 2\pi (x + 3) y_2\,dx = 2\pi \int_1^2 (x + 3) x^3\,dx - 2\pi \int_1^2 (x + 3)\,dx,$$

which works out to be $259\pi/10$ cubic units.

Let’s repeat the same example, this time taking horizontal strips. Now we
have to use the disc method, since the thin part of each strip is parallel to
the axis of revolution. We need to switch x and y, since the axis is vertical
(Variation 1); we also have to think of the region as lying horizontally between
the curves $x_1 = 2$ on the right and $x_2 = y^{1/3}$ on the left (Variation 2); finally,
we need to replace x by $x + 3$ (Variation 3), which will actually mean replacing
$x_1$ by $x_1 + 3$ and also $x_2$ by $x_2 + 3$. So this example uses all three of our
variations! The standard disc volume is $\pi y^2\,dx$; change x and y to get $\pi x^2\,dy$;
replace x by $x + 3$ to get $\pi (x + 3)^2\,dy$; then integrate this from 1 to 8, separately
for $x_1$ and $x_2$, and take the difference. This shows that the volume is

$$\int_1^8 \pi (x_1 + 3)^2\,dy - \int_1^8 \pi (x_2 + 3)^2\,dy = \pi \int_1^8 (2 + 3)^2\,dy - \pi \int_1^8 (y^{1/3} + 3)^2\,dy,$$

which again works out to be $259\pi/10$ cubic units. At least we got the same
answer as before! Again, it’s a good idea to convince yourself that you can 
find the volume of a typical disc. 
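
The three answers above can be confirmed numerically. This is a minimal Python sketch, not the book's method: the helper `simpson` and the variable names are illustrative, and the region is assumed (reading it off the integrals above) to be the one bounded by $y = x^3$, $y = 1$ and $x = 2$, so that x runs from 1 to 2 and y runs from 1 to 8.

```python
import math

def simpson(f, a, b, n=2000):
    """Composite Simpson's rule on [a, b] using an even number n of subintervals."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

# About x = 2, discs with horizontal strips: radius 2 - x, where x = y**(1/3).
about_x_eq_2 = simpson(lambda y: math.pi * (2 - y ** (1 / 3)) ** 2, 1.0, 8.0)

# About x = -3, shells with vertical strips: radius x + 3, height x**3 - 1.
shells = simpson(lambda x: 2 * math.pi * (x + 3) * (x**3 - 1), 1.0, 2.0)

# About x = -3, discs with horizontal strips: outer radius 2 + 3, inner radius y**(1/3) + 3.
discs = simpson(
    lambda y: math.pi * ((2 + 3) ** 2 - (y ** (1 / 3) + 3) ** 2), 1.0, 8.0
)

print(about_x_eq_2, 8 * math.pi / 5)
print(shells, discs, 259 * math.pi / 10)
```

Swapping `(2 - y ** (1 / 3))` for `(y ** (1 / 3) - 2)` makes no difference here because the radius is squared, which matches the remark that the sign only really matters for shells.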

Anyway, that’s more than enough theory about volumes of revolution; you 
have to do a lot of practice problems if you want to master all the variations. 
For now, it’s time to look at finding the volume of more general solids. 


29.2 Volumes of General Solids

Most solids can’t be formed by revolving some planar area about an axis in 
that plane. For example, a pyramid has no curvy surfaces, so it isn’t a solid 
of revolution, no matter which way you look at it. One technique for finding
the volume of such a solid is the method of slicing, which actually generalizes 
the disc method from Section 29.1.1 above. 

Imagine your solid is a vegetable, like a cucumber or a squash. You put it 
on a cutting board and chop it up into thin, parallel slices. The slices won’t 
all be the same size. Even the two exposed areas of an individual slice won’t 
always be the same. For example, in the case of a cucumber, the slices near 
the end will be a little skewed. On the other hand, if a slice is very thin, 
then its two exposed areas will be pretty close. So we’re going to approximate 
the volume of the slice by taking one of the exposed areas — it doesn’t matter 
which one — and multiplying by the thickness of the slice. Then we’re going to 
add up all the volumes and take the limit as the slice thicknesses all go down 
to zero. 

Now, in practice, this procedure is a little complicated. The fact is, there 
are many ways to cut the solid. For example, if you cut up a cucumber lying 
on its side, you get thin disc-like slices. If you stand the cucumber on its end, 
it’s more difficult to slice, but you could still do it. You’d end up with slices 
which look like ovals of different sizes. Or you could tilt the cucumber on an 
angle and get smaller ovals. 

Basically, here is your choice: you need to pick an axis, which doesn’t 
necessarily have to go through the solid. All your slices will be perpendicular 
to this axis. Once you’ve picked the axis, your way forward is clear: you 
need to find the cross-sectional area of every slice perpendicular to that axis. 
Different slices will have different areas. So, on your axis, you need to specify 
an origin and a positive direction, then work out the cross-sectional area of 
a slice through x, where x is an arbitrary point on the axis. The last step
is to approximate the volume of the slice by the area multiplied by the tiny 
thickness dx, then integrate; this adds up the volume of all the slices, while 
simultaneously taking the limit as the maximum slice thickness goes down to 
zero. In summary, then, here’s the plan: 

1. Choose an axis. 

2. Find a typical cross-sectional area at a point x on the axis; call this area
A(x) square units.

3. Then if V is the volume of the solid (in cubic units), we have

$$V = \int_a^b A(x)\,dx,$$

where [a, b] is the range of x which completely covers the solid.



Believe me, you really want to choose this axis so that the cross-sections are 
as simple as possible. It helps if you can ensure that the cross-sections are in 
fact similar to each other, that is, different-sized copies of each other. This 
isn’t always possible, though. 

Let’s use the above technique to find the volume of a “generalized” cone. 
What this means is that we have some shape in a plane of area A square units, 
and an apex point P which hovers some distance above the plane: 

Now we draw a line segment from each point on the edge of the shape up to
P. This gives us a surface whose base is the shape we started with. The solid 
we’re interested in is the filled-in version of the surface, or if you prefer, the 
interior of the surface. Here’s sort of what the surface looks like, in skeleton 
form: 



For example, if the base is a circle and the point P sits directly above the 
center of the circle, then we’ll get an ordinary cone. If the base is a square 
and P is directly above the center of the square (that is, where the diagonals 
of the square intersect), then we’ll get a square pyramid. You should think 
about what choice of base and point P gives you (a) a regular pyramid, or 
(b) a skew-cone (which looks like a weird hat — sort of like a witch’s hat but 
it doesn’t go straight up). It turns out that the only quantities which are 
relevant to finding the volume of the solid are the area of the base, A square 
units, and the perpendicular distance from P to the plane. We’ll call this last 
quantity h units (it’s labeled in the above figures). 

So, how do we find the volume? We first have to choose an axis. P seems 
to be a special point, so the line we choose should probably pass through P. 
Where else should it go? You could try all sorts of things, but the only thing 
that works is to make the line perpendicular to the plane which contains the 
base. Let’s also set the origin of the axis to lie at P, and the positive direction 
will be downward. (This might seem a little strange, but there’s no reason 
not to make downward positive. After all, the generalized cone might have 
been presented to us as balanced on its point, in which case upward would be 
positive.) This will make our calculations much easier. Let’s see what happens 
if we pick a point x on the axis and take a perpendicular slice through x: 

which means that l = xL/h. Let’s just do a quick reality check on this
equation. If x = 0, then the slice is just through the top of the cone (P) and
l should be 0, which it is. On the other hand, if x = h, then the slice is just
the base plane, and the cross-section isn’t a smaller copy of the base: it is
the base. So of course, in that case l should equal L, which it does.

Now let’s look at our base and our cross-section, with the line segments of 
lengths L and l units drawn in: 


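[Figure: the base (area A, with a line segment of length L) beside the cross-section (area A(x), with the corresponding line segment of length l).]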

These two figures, including the line segments, are similar — one is an exact 
magnification of the other. Now here’s an important principle of similarity. 
Say we have two similar figures, and we know the lengths of corresponding 
line segments, one on each figure. The line segments have to match exactly if 
we magnify one figure to be the same size as the other one. Then the ratio of 
areas of the figures is the square of the ratio of the two corresponding lengths. 
For example, if we take two square tiles, one with side length three times that 
of the other, then the area of the big tile is nine times that of the small one. 
So, going back to the picture above, the area of the base is A square units and 
the area of the cross-section is A(x) square units. So the ratio of the areas is 
the square of the ratio of the corresponding lengths, which are L and l units 
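in our case:

$$\frac{A(x)}{A} = \left(\frac{l}{L}\right)^2.$$

Simplifying and using our above expression for l, we get

$$A(x) = \frac{A l^2}{L^2} = \frac{A}{L^2}\left(\frac{xL}{h}\right)^2 = \frac{A x^2}{h^2}.$$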

Once again, a reality check: if x = 0, the cross-section is just the point P,
which has no area. This checks out, since $A(0) = A \times 0^2/h^2 = 0$. How about
when x = h? Then we’re dealing with the base, so our cross-sectional area
should be A square units. No problem: $A(h) = A \times h^2/h^2 = A$.

Finally, we’re ready to integrate! The only question is, what’s the range
of x? Well, as we’ve seen, x = 0 is the top and x = h is the bottom, so that’s
the correct range of x. So,

$$V = \int_0^h A(x)\,dx = \int_0^h \frac{A x^2}{h^2}\,dx = \frac{A}{h^2} \cdot \frac{h^3}{3} = \frac{1}{3} A h$$

cubic units.

Hey, so we just got the formula for the volume of any sort of pyramid or
cone-like object. For example, for the regular old cone, the volume is $\frac{1}{3}\pi r^2 h$
cubic units, which is exactly what we found above since $A = \pi r^2$. Same thing
for a square pyramid: the volume is $\frac{1}{3} l^2 h$ cubic units (where the side length
of the base is l units), which works as well because the base area is given by
$A = l^2$.
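
To see the slicing recipe at work on the generalized cone, here is a minimal Python sketch. The helper names `simpson` and `volume_by_slicing`, and the sample values of the base area and height, are hypothetical choices made purely for illustration.

```python
import math

def simpson(f, a, b, n=2000):
    """Composite Simpson's rule on [a, b] using an even number n of subintervals."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def volume_by_slicing(cross_section_area, a, b):
    """Integrate the cross-sectional area A(x) along the chosen axis from a to b."""
    return simpson(cross_section_area, a, b)

# Generalized cone: base area A_base, height h_cone, axis measured downward from the apex P.
A_base, h_cone = 12.0, 5.0  # illustrative values, not from the text
cone_volume = volume_by_slicing(lambda x: A_base * x**2 / h_cone**2, 0.0, h_cone)
print(cone_volume, A_base * h_cone / 3)

# Ordinary cone of radius r: the base area is pi * r**2.
r = 2.0  # illustrative value
round_cone = volume_by_slicing(lambda x: math.pi * r**2 * x**2 / h_cone**2, 0.0, h_cone)
print(round_cone, math.pi * r**2 * h_cone / 3)
```

Notice that only the base area and the height enter the computation; the shape of the base never appears, which is the point of the derivation above.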

Let’s look at one more example. Take the curve $y = e^x$ between x = 0
and x = 1/2 and consider the region between the curve and the x-axis. It looks
something like this: 



Suppose you have a somewhat bizarre solid sitting on top of the above plane, 
sticking out of the page, whose base is exactly the shaded region. The solid is 
shaped in such a way that if you cut it straight down along any line parallel 
to the y-axis, then the cross-section is a rectangle whose long side lies in the 
base of the figure, and whose short side is half the length of the long side. 
Tipping the graph over a little in order to see the perspective, here’s what a 
few of the cross-sections look like: 



What is the volume of the solid? Let’s start by picking an axis. How about
the x-axis? That sounds reasonable since we know what the cross-sections
perpendicular to this axis look like. We already have an origin and a positive
direction, so let’s stick with them. At the point x on the axis, the vertical
line segment has length $e^x$ units. This is the length of the long side of the
rectangle, so the short side has length $\frac{1}{2} e^x$ units (remember, the short side is
half the length of the long side). The area of the rectangle is therefore

$$A(x) = e^x \times \frac{1}{2} e^x = \frac{1}{2} e^{2x}$$

square units. So the volume is

$$V = \int_0^{1/2} A(x)\,dx = \frac{1}{2} \int_0^{1/2} e^{2x}\,dx = \frac{1}{2} \left[\frac{e^{2x}}{2}\right]_0^{1/2} = \frac{1}{4}(e - 1) \text{ cubic units.}$$
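
Here is a minimal Python sketch that checks this last volume numerically. The helper `simpson` and the function name `rectangle_area` are illustrative assumptions; the cross-section is the rectangle described above, with long side $e^x$ and short side half of that.

```python
import math

def simpson(f, a, b, n=2000):
    """Composite Simpson's rule on [a, b] using an even number n of subintervals."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def rectangle_area(x):
    """Area of the cross-section at x: long side e**x, short side half as long."""
    long_side = math.exp(x)
    short_side = long_side / 2
    return long_side * short_side

solid_volume = simpson(rectangle_area, 0.0, 0.5)
print(solid_volume, (math.e - 1) / 4)
```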


29.3 Arc Lengths 

Say we have a graph of y = f(x) for some function f, where x ranges from
a to b. Take a piece of string and lay it on top of the curve, marking both 
ends, and then take it off the page, straighten it out, and measure the length 
between the marks. How do you calculate what the length would be? This 
length is called the arc length of the curve, and we’re going to find a formula 
for it. The strategy will be to get a sort of prototype expression, then to 
adapt this to get several useful versions of the formula. 

So, let’s look at a little piece of curve between x and x + dx: 



Let’s approximate the length of the curve between A and B by the length
of the dotted line segment AB. The closer A and B are to each other, the
better the approximation. By Pythagoras’ Theorem, the length of AB is
$\sqrt{(dx)^2 + (dy)^2}$ units. Now we just need to repeat this process with lots of
little line segments which flesh out an approximation to the curve, then add
up the lengths and take some sort of limit. As usual, the integration takes
care of the adding up and limiting parts, but you have to be careful. If you
just put an integral sign in front of the little length $\sqrt{(dx)^2 + (dy)^2}$, you’ll get

$$\text{arc length} = \int \sqrt{(dx)^2 + (dy)^2}.$$

The problem is, this integral doesn’t really mean anything! We need to integrate
with respect to one variable. Luckily you can adapt the above formula
in a variety of cases to produce a meaningful result. For example, you could
pull a factor of $(dx)^2$ out of the square root to express the little bit of arc
length as $\sqrt{1 + (dy/dx)^2}\,dx$ units, which seems much more promising. (That
maneuver actually needs a proof, but the details are a little beyond the scope
of this book.) Anyway, in each of the cases below, we’ll see how to adapt the
above prototypical formula to get a legitimate formula for arc length:

1. If y = f(x) and x ranges from a to b, then take out a factor of $(dx)^2$ in
the above integrand (as we just did above) and pull it out of the square
root to get

$$\text{arc length} = \int_a^b \sqrt{1 + \left(\frac{dy}{dx}\right)^2}\,dx \quad \text{(standard form).}$$

In terms of f, you can rewrite this as

$$\text{arc length} = \int_a^b \sqrt{1 + (f'(x))^2}\,dx.$$

2. Suppose that x is given in terms of y. If x = g(y) and y ranges from A
to B, then you take out a factor of $(dy)^2$ instead (or if you prefer, swap
each occurrence of x and y in the boxed formula above) to get

$$\text{arc length} = \int_A^B \sqrt{1 + \left(\frac{dx}{dy}\right)^2}\,dy \quad \text{(in terms of } y\text{),}$$

which can also be written as

$$\text{arc length} = \int_A^B \sqrt{1 + (g'(y))^2}\,dy.$$

3. How about the parametric form? This means that x and y are functions
of a parameter t which ranges from $t_0$ to $t_1$. (See Section 27.1 in
Chapter 27 for a review of parametric equations.) We can think of the
quantity $(dx)^2$ as $(dx/dt)^2 (dt)^2$ and similarly for y. We can then pull
the $(dt)^2$ out and take its square root to get the useful formula:

$$\text{arc length} = \int_{t_0}^{t_1} \sqrt{\left(\frac{dx}{dt}\right)^2 + \left(\frac{dy}{dt}\right)^2}\,dt \quad \text{(parametric version).}$$


4. A special case of this last formula occurs in the case of polar coordinates.
In particular, in Section 27.2.4 of Chapter 27, we saw how to find the
area inside the curve $r = f(\theta)$, where $\theta$ ranges from $\theta_0$ to $\theta_1$; now
let’s find the arc length of the same curve. We know that $x = r\cos(\theta)$
and $y = r\sin(\theta)$, so replacing r by $f(\theta)$, we have $x = f(\theta)\cos(\theta)$ and
$y = f(\theta)\sin(\theta)$. Here $\theta$ acts as a parameter, so we can use the above
formula for arc length in parameters (with t replaced by $\theta$). We’ll need
to know what $dx/d\theta$ and $dy/d\theta$ are. By the product rule,

$$\frac{dx}{d\theta} = f'(\theta)\cos(\theta) - f(\theta)\sin(\theta)$$

and

$$\frac{dy}{d\theta} = f'(\theta)\sin(\theta) + f(\theta)\cos(\theta).$$

Now you have to square both of these things and add them. Go on,
try it! You’ll find that some terms cancel. Also you have two lots of
$\sin^2(\theta) + \cos^2(\theta)$ terms which can be replaced by 1. Altogether, you
should get the formula

$$\text{arc length} = \int_{\theta_0}^{\theta_1} \sqrt{(f(\theta))^2 + (f'(\theta))^2}\,d\theta \quad \text{(polar, } r = f(\theta)\text{).}$$








By the way, you should express all these arc lengths in units. 
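
The polar formula in case 4 is easy to test numerically. The sketch below is a minimal Python illustration under stated assumptions: the helper names `simpson` and `polar_arc_length` are made up for this example, and the two test curves (the circle r = 3 and the cardioid r = 1 + cos(θ), whose length is the standard value 8) are not taken from the text. It also checks at one sample angle that squaring and adding the two product-rule derivatives really does collapse to $(f(\theta))^2 + (f'(\theta))^2$.

```python
import math

def simpson(f, a, b, n=2000):
    """Composite Simpson's rule on [a, b] using an even number n of subintervals."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def polar_arc_length(f, fprime, theta0, theta1):
    """Arc length of r = f(theta): integrate sqrt(f**2 + f'**2) d(theta)."""
    return simpson(lambda th: math.sqrt(f(th) ** 2 + fprime(th) ** 2), theta0, theta1)

# Circle r = 3: the length should be the circumference 2 * pi * 3.
circle = polar_arc_length(lambda th: 3.0, lambda th: 0.0, 0.0, 2 * math.pi)

# Cardioid r = 1 + cos(theta), with derivative -sin(theta).
f = lambda th: 1 + math.cos(th)
fprime = lambda th: -math.sin(th)
cardioid = polar_arc_length(f, fprime, 0.0, 2 * math.pi)

# Check the algebra at one angle: (dx/dtheta)**2 + (dy/dtheta)**2 == f**2 + f'**2.
th = 0.7
dx = fprime(th) * math.cos(th) - f(th) * math.sin(th)
dy = fprime(th) * math.sin(th) + f(th) * math.cos(th)
identity_gap = abs((dx**2 + dy**2) - (f(th) ** 2 + fprime(th) ** 2))

print(circle, 6 * math.pi)
print(cardioid, identity_gap)
```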

Let’s look at some examples. Suppose you want to find the arc length of
the curve y = ln(x) where x ranges from $\sqrt{3}$ to $\sqrt{15}$. We use the first formula
above to see that

$$\text{arc length} = \int_{\sqrt{3}}^{\sqrt{15}} \sqrt{1 + \left(\frac{1}{x}\right)^2}\,dx = \int_{\sqrt{3}}^{\sqrt{15}} \frac{\sqrt{x^2 + 1}}{x}\,dx.$$

This is actually quite a difficult integral. You should definitely try it as an
exercise. If you get stuck, here is the plan of attack: start out with an
appropriate trig substitution. If you do it right, the indefinite version of the
integral becomes $\int \sec^3(\theta)/\tan(\theta)\,d\theta$. To find this, express the numerator as
$\sec(\theta)(1 + \tan^2(\theta))$ and break everything up into two integrals, which can be
done using the techniques in Chapter 19. Check that you get an arc length
of $2 + \ln(3) - \frac{1}{2}\ln(5)$ units.

How about if you are looking for the arc length of the curve described in
parameters as $x = 3t^2 - 12t + 4$ and $y = 8\sqrt{2}\,t^{3/2}$, where t ranges from 3 to 5?
We have to use the parametric version of the formula. Indeed, $dx/dt = 6t - 12$
and $dy/dt = 12\sqrt{2}\,t^{1/2}$, so

$$\text{arc length} = \int_3^5 \sqrt{(6t - 12)^2 + (12\sqrt{2}\,t^{1/2})^2}\,dt.$$

Now let’s look at the innermost part of the integrand. There’s a factor of $6^2$
which can be pulled out to get

$$(6t - 12)^2 + (12\sqrt{2}\,t^{1/2})^2 = 6^2\left((t - 2)^2 + (2\sqrt{2}\,t^{1/2})^2\right) = 36(t^2 - 4t + 4 + 8t) = 36(t + 2)^2.$$

It is now a simple matter to substitute this into the integrand and do the 
integration to see that the arc length is 72 units. I’ll leave the details to you! 
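
Both arc lengths in these examples can be checked numerically before you attempt the integrals by hand. The following is a minimal Python sketch; the helper `simpson` and the variable names are illustrative assumptions, and the integrands are the standard-form and parametric-form expressions for the two curves above.

```python
import math

def simpson(f, a, b, n=2000):
    """Composite Simpson's rule on [a, b] using an even number n of subintervals."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

# Standard form for y = ln(x): dy/dx = 1/x, with x from sqrt(3) to sqrt(15).
log_curve_length = simpson(
    lambda x: math.sqrt(1 + (1 / x) ** 2), math.sqrt(3), math.sqrt(15)
)
log_curve_exact = 2 + math.log(3) - 0.5 * math.log(5)

# Parametric form: dx/dt = 6t - 12 and dy/dt = 12*sqrt(2)*sqrt(t), with t from 3 to 5.
param_length = simpson(
    lambda t: math.hypot(6 * t - 12, 12 * math.sqrt(2) * math.sqrt(t)), 3.0, 5.0
)

print(log_curve_length, log_curve_exact)
print(param_length)  # compare with the value 72 quoted in the text
```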



29.3.1 Parametrization and speed

Before we move on to finding surface areas, there’s one little fact related to the 
arc length formula in parametric coordinates that I’d like to look at. Suppose 
an ant (not a snail, this time!) is crawling around on a flat piece of ground, 
and we define the ant’s position at time t seconds to be (x(t), y(t)). What is
the speed of the ant at time t? Well, we know that velocity is the derivative
of position with respect to time. So the ant’s velocity in the x direction is
dx/dt and its velocity in the y direction is dy/dt. Its real speed has to involve
both of these velocities. In fact, by Pythagoras’ Theorem, we should have*:

$$\text{speed} = \sqrt{\left(\frac{dx}{dt}\right)^2 + \left(\frac{dy}{dt}\right)^2}.$$

Hey, this is the quantity that we’ve been integrating to find arc length in the 
parametric case! Indeed, to find the total distance the ant has traveled, you 
have to integrate its speed. So we now have a meaning for the integrand in the 
formula for arc length, at least in the parametric case: it is the instantaneous 
speed of a particle moving along the curve, as described by the parameters. 

Consider the example at the end of the previous section where we have
$x = 3t^2 - 12t + 4$ and $y = 8\sqrt{2}\,t^{3/2}$. From what we observed above,

$$\text{speed} = \sqrt{\left(\frac{dx}{dt}\right)^2 + \left(\frac{dy}{dt}\right)^2} = \sqrt{36(t + 2)^2} = 6(t + 2),$$

where the answer is in units per second (assuming t is measured in seconds).
This means that at time t = 3, the speed of a particle, whose position at time
t is (x(t), y(t)), is 6(3 + 2) = 30 units per second; whereas at time t = 5, the
speed’s a little higher at 6(5 + 2) = 42 units per second.

In Section 27.1 of Chapter 27, we observed that the parametric equations
x = 3cos(t) and y = 3sin(t), where $0 \le t < 2\pi$, describe the circle of radius 3
units centered at the origin. The speed of a particle moving as described by
these equations is

$$\sqrt{\left(\frac{dx}{dt}\right)^2 + \left(\frac{dy}{dt}\right)^2} = \sqrt{(-3\sin(t))^2 + (3\cos(t))^2} = \sqrt{9} = 3,$$

since $\sin^2(t) + \cos^2(t) = 1$. This means that the particle moves at a constant
speed of 3 units per second around the circle (counterclockwise, of course).
On the other hand, we also observed that x = 3cos(2t) and y = 3sin(2t), this
time where $0 \le t < \pi$, also describes the same circle. Now the speed is

$$\sqrt{\left(\frac{dx}{dt}\right)^2 + \left(\frac{dy}{dt}\right)^2} = \sqrt{(-6\sin(2t))^2 + (6\cos(2t))^2} = \sqrt{36} = 6,$$

so a particle observing this new parametrization does indeed move around the 
same circle twice as fast as the original particle. 
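
The speed formula can be tried out directly on the three parametrizations discussed in this section. This is a minimal Python sketch; the function name `speed` and the sample times are illustrative assumptions, and the derivatives are entered by hand from the text.

```python
import math

def speed(dxdt, dydt, t):
    """Instantaneous speed: the square root of (dx/dt)**2 + (dy/dt)**2 at time t."""
    return math.hypot(dxdt(t), dydt(t))

# x = 3t^2 - 12t + 4, y = 8*sqrt(2)*t^(3/2): the speed should equal 6(t + 2).
ant_dx = lambda t: 6 * t - 12
ant_dy = lambda t: 12 * math.sqrt(2) * math.sqrt(t)
ant_speeds = [speed(ant_dx, ant_dy, t) for t in (3.0, 5.0)]

# x = 3cos(t), y = 3sin(t): constant speed 3 around the circle.
slow = [speed(lambda t: -3 * math.sin(t), lambda t: 3 * math.cos(t), t)
        for t in (0.0, 1.0, 2.5)]

# x = 3cos(2t), y = 3sin(2t): same circle, traversed twice as fast.
fast = [speed(lambda t: -6 * math.sin(2 * t), lambda t: 6 * math.cos(2 * t), t)
        for t in (0.0, 1.0, 2.5)]

print(ant_speeds)  # compare with 6(3 + 2) and 6(5 + 2)
print(slow, fast)
```

Integrating the second or third speed over its full t-range gives the same total distance, the circumference of the circle, even though the speeds differ.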


* We’re getting into vectors here; this really belongs in a book on multivariable calculus.


29.4 Surface Areas of Solids of Revolution

The last thing we’ll consider in this chapter is how to find the surface area of
a surface formed by revolving a curve about an axis. The method is a sort of
combination of how we found arc lengths and volumes. We start by chopping
the curve into small bits of arc, then concentrating on what happens to one
of these small bits when we revolve it about the axis. Let’s suppose we are
revolving about the x-axis. What happens to one of these little bits of arc
when we revolve it? We get a sort of loop, but the side of it is pretty curvy.
If the width of the loop is small enough, we should be able to approximate it
by a straighter version. Let’s start off by approximating the arc by its secant
line segment, just as we did in Section 29.3 above. As we saw, the length of
that secant is $\sqrt{(dx)^2 + (dy)^2}$ units. When we revolve that secant instead of
the arc length, we get a loop whose outside is straighter, like this:




The left-hand picture above shows a piece of the curve and the approximating 
secant; the middle picture shows the actual curvy ring whose surface area we 
want to find; and the right-hand picture shows the approximating loop which 
we’re going to use instead. Actually, we are even lazier than that: the side 
of the loop is not parallel to the x-axis, so our loop is actually part of the
surface of a cone. The area of such an object can be computed, but it’s really 
messy. Instead, we are going to do a further approximation and pretend that 
we are just dealing with a loop with the same side length, but now the loop 
is cylindrical: 

The end result is that we have a cylindrical loop of radius y units, and width
$\sqrt{(dx)^2 + (dy)^2}$ units, so it has surface area $2\pi y \sqrt{(dx)^2 + (dy)^2}$ square units.
(That’s the circumference, $2\pi y$ units, times the width.) It turns out* that the
approximation works in the limit as the loop width goes down to zero and
we add up the surface areas of the loops, so we are led to the prototypical
formula for revolution about the x-axis:

$$\text{surface area} = \int 2\pi y \sqrt{(dx)^2 + (dy)^2} \quad \text{(revolution about x-axis).}$$

Alternatively, if the revolution is about the y-axis, then the loop we use has
the same width, but the radius is now x instead of y units, so the prototypical
formula for revolution about the y-axis is

$$\text{surface area} = \int 2\pi x \sqrt{(dx)^2 + (dy)^2} \quad \text{(revolution about y-axis).}$$



You can also see this along the lines of Variation 1 from volumes (see Section
29.1.4 above) by switching x and y in the first prototypical formula above.

Anyway, just as in the case of arc lengths, these prototypical formulas can’t 
actually be applied to find any surface areas! Let’s see how we can modify 
the formulas so we can actually use them: 


1. Suppose we want to revolve the curve y = f(x) about the x-axis, where
x ranges from a to b. We take out a factor of $(dx)^2$ in the integrand of
the first prototypical formula and pull it out of the square root, just as
we did in the case of arc length, to get

$$\text{surface area} = \int_a^b 2\pi y \sqrt{1 + \left(\frac{dy}{dx}\right)^2}\,dx \quad \text{(about the x-axis).}$$

In terms of f, it looks like this:

$$\text{surface area} = \int_a^b 2\pi f(x) \sqrt{1 + (f'(x))^2}\,dx.$$

2. If instead we want to revolve the same curve about the y-axis, the same
manipulations applied to the other prototypical formula give

$$\text{surface area} = \int_a^b 2\pi x \sqrt{1 + \left(\frac{dy}{dx}\right)^2}\,dx \quad \text{(about the y-axis),}$$

or in terms of f,

$$\text{surface area} = \int_a^b 2\pi x \sqrt{1 + (f'(x))^2}\,dx.$$


3. Of course, there’s also a parametric form. If x and y are functions of a
parameter t which ranges from $t_0$ to $t_1$, then dividing and multiplying
by dt leads to the following formulas:

$$\text{surface area} = \int_{t_0}^{t_1} 2\pi y \sqrt{\left(\frac{dx}{dt}\right)^2 + \left(\frac{dy}{dt}\right)^2}\,dt \quad \text{(parametric version, about the x-axis)}$$

and

$$\text{surface area} = \int_{t_0}^{t_1} 2\pi x \sqrt{\left(\frac{dx}{dt}\right)^2 + \left(\frac{dy}{dt}\right)^2}\,dt \quad \text{(parametric version, about the y-axis).}$$

*The computations involved are a little gross. If you want to try it, use the fact that
the surface area of a frustum of a cone of radii r and R units and height h units is given
by $\pi (R + r)\sqrt{(R - r)^2 + h^2}$ square units.

Again, all of these surface areas are in square units. 

Here’s an example: if the curve y = cos(x) from x = 0 to $x = \pi/2$ is
revolved about the x-axis, we need the formula from case 1 above to see that
the surface area would be

$$\int_0^{\pi/2} 2\pi \cos(x) \sqrt{1 + \sin^2(x)}\,dx.$$

To evaluate this integral, first let t = sin(x), then use a trig substitution
to handle the new integral. Try it: the surface area should work out to be
$\pi(\sqrt{2} + \ln(1 + \sqrt{2}))$ square units.

On the other hand, the surface area resulting when the parabola $y = x^2/2$
between x = 0 and $x = 2\sqrt{2}$ is revolved about the y-axis (not the x-axis) can
be found using the formula from case 2 above; since dy/dx = x, the surface
area is given by

$$\int_0^{2\sqrt{2}} 2\pi x \sqrt{1 + \left(\frac{dy}{dx}\right)^2}\,dx = \int_0^{2\sqrt{2}} 2\pi x \sqrt{1 + x^2}\,dx;$$

it works out to be $52\pi/3$ square units after substituting $t = 1 + x^2$.

Now consider the semicircle which is the upper half of the circle of radius
r units centered at the origin. This is parametrized by $x = r\cos(\theta)$ and
$y = r\sin(\theta)$, where $\theta$ ranges from 0 to $\pi$ (we stop at $\pi$ so we only get the
upper half). If we revolve this semicircle around the x-axis, we get a sphere,
whose surface area is given by the first formula in case 3 above (with t replaced
by $\theta$):

$$\int_0^{\pi} 2\pi y \sqrt{\left(\frac{dx}{d\theta}\right)^2 + \left(\frac{dy}{d\theta}\right)^2}\,d\theta = 2\pi \int_0^{\pi} r\sin(\theta) \sqrt{(-r\sin(\theta))^2 + (r\cos(\theta))^2}\,d\theta.$$

You can now use the fact that $\sin^2(\theta) + \cos^2(\theta) = 1$ to see that the surface
area works out to be $4\pi r^2$ square units, justifying the classic formula.
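
The three surface areas above can be verified numerically. The following is a minimal Python sketch; the helper `simpson`, the variable names, and the sample sphere radius are illustrative assumptions, while the integrands are the ones from cases 1, 2 and 3.

```python
import math

def simpson(f, a, b, n=2000):
    """Composite Simpson's rule on [a, b] using an even number n of subintervals."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

# Case 1: y = cos(x) about the x-axis, 0 <= x <= pi/2, with dy/dx = -sin(x).
cos_area = simpson(
    lambda x: 2 * math.pi * math.cos(x) * math.sqrt(1 + math.sin(x) ** 2),
    0.0, math.pi / 2,
)
cos_exact = math.pi * (math.sqrt(2) + math.log(1 + math.sqrt(2)))

# Case 2: y = x**2 / 2 about the y-axis, 0 <= x <= 2*sqrt(2), with dy/dx = x.
parabola_area = simpson(
    lambda x: 2 * math.pi * x * math.sqrt(1 + x**2), 0.0, 2 * math.sqrt(2)
)

# Case 3: upper semicircle of radius r about the x-axis gives a sphere.
r = 2.0  # illustrative radius, not from the text
sphere_area = simpson(
    lambda th: 2 * math.pi * r * math.sin(th)
    * math.hypot(-r * math.sin(th), r * math.cos(th)),
    0.0, math.pi,
)

print(cos_area, cos_exact)
print(parabola_area, 52 * math.pi / 3)
print(sphere_area, 4 * math.pi * r**2)
```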

Finally, let’s consider the surface area analogue of Variation 3 from volumes
of revolution (see Section 29.1.6 above). If the axis of revolution is not the
x-axis, but is instead the line y = h (which is parallel to the x-axis), then the
radius of the cylindrical loop is $y - h$ units, not y units, so the formula from
case 1 above needs to be modified appropriately:

$$\text{surface area} = \int_a^b 2\pi (y - h) \sqrt{1 + \left(\frac{dy}{dx}\right)^2}\,dx \quad \text{(about } y = h\text{).}$$

(Actually, if the curve is under the line y = h, you’d better use $h - y$ instead
of $y - h$, or you’ll get a negative answer for your surface area!) Again, you
shouldn’t learn the above formula separately; instead, understand how to
derive it from the one you already know. In fact, you should now be able to
modify all the other formulas above to allow for revolution about y = h or
x = h, as appropriate.




CHAPTER 30 


Differential Equations 

A differential equation is an equation involving derivatives. These things 
are really useful for describing how quantities change in the real world. For 
example, if you want to understand how fast a population grows, or even how 
quickly you can pay off a student loan, a differential equation can help model 
the situation and give you a decent answer. In this final chapter, we’ll see 
how to solve certain types of differential equations. In particular, here’s what 
we’ll look at: 

• an introduction to differential equations; 

• separable first-order differential equations; 

• first-order linear differential equations; 

• first- and second-order constant-coefficient differential equations; and 

• modeling using differential equations. 

30.1 Introduction to Differential Equations


We’ve already seen an example of a differential equation when we looked at
exponential growth and decay, way back in Section 9.6 of Chapter 9. We
considered the equation

$$\frac{dy}{dx} = ky,$$

where $k$ is some fixed constant, and claimed that the only solutions to it
are of the form $y = Ae^{kx}$ for some constant $A$. We’ll prove this claim in
Section 30.2 below. By the way, we shouldn’t be surprised to see a constant
like $A$ popping up. After all, the original equation involves a derivative; the
only way to unravel a derivative is to integrate it, and integration introduces
an unknown constant (think $+C$).

The equation $dy/dx = ky$ is an example of a first-order differential equation.
This is because there’s only a first derivative floating around. In general,
the order of a differential equation is the order of the highest derivative
involved. For example, the nasty equation




$$\frac{d^4 y}{dx^4} - \sin(x) + e^x y = \tan(x)$$



is a fourth-order differential equation, since there is a fourth derivative
involved but no fifth or higher derivative.

Now consider a specific example of the first-order differential equation at
the beginning of this section, but with an extra condition:

$$\frac{dy}{dx} = -2y, \qquad y(0) = 5.$$

This means that not only do you need the differential equation to be satisfied
by your solution, you also need to ensure that when you set $x = 0$, you get
$y = 5$. We know that $y = Ae^{kx}$ is the general solution to the differential
equation $dy/dx = ky$; by setting $k = -2$, we see that the general solution of
the above differential equation is $y = Ae^{-2x}$ for some constant $A$. Now put
$x = 0$ and $y = 5$ to see that $5 = Ae^{-2(0)}$, or simply $A = 5$. The extra piece of
information $y(0) = 5$ has allowed us to pin down the value of $A$, so the actual
solution is $y = 5e^{-2x}$.
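Here’s one way to convince yourself numerically that $y = 5e^{-2x}$ really is the solution. The sketch below steps the equation $dy/dx = -2y$ forward from $y(0) = 5$ using the classical fourth-order Runge-Kutta method (a standard numerical technique that this book doesn’t cover; it’s swapped in purely as an independent check), then compares the result at $x = 1$ with $5e^{-2}$. The function name `rk4` and the step count are assumptions for illustration.

```python
import math

def rk4(f, x0, y0, x_end, steps=1000):
    """Classical fourth-order Runge-Kutta for dy/dx = f(x, y), starting from y(x0) = y0."""
    h = (x_end - x0) / steps
    x, y = x0, y0
    for _ in range(steps):
        k1 = f(x, y)
        k2 = f(x + h / 2, y + h * k1 / 2)
        k3 = f(x + h / 2, y + h * k2 / 2)
        k4 = f(x + h, y + h * k3)
        y += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        x += h
    return y

# The IVP from the text: dy/dx = -2y with y(0) = 5
numeric = rk4(lambda x, y: -2 * y, 0.0, 5.0, 1.0)
exact = 5 * math.exp(-2 * 1.0)   # claimed solution y = 5e^{-2x}, evaluated at x = 1

print(numeric, exact)
```

Changing the starting value `5.0` changes the answer in proportion, which is exactly the role played by the constant $A$.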

What we have just been looking at is an example of an initial value problem,
or IVP. The idea is that you know a starting condition (in this case,
$y(0) = 5$) as well as a differential equation that tells you how the situation
evolves from there (in this case, $dy/dx = -2y$), and you can use these two
facts to find out the exact solution with no undetermined constants. For a
second-order differential equation, you effectively need to integrate twice, so
you’ll get two undetermined constants; it follows that you need two pieces
of information. Normally these would be the value of $y(0)$ as well as the
value of $y'(0)$ (the derivative when $x = 0$). We’ll see some examples of this in
Section 30.4.2 below.

Now, the study of differential equations is pretty bloody huge. These 
things are hard to solve. In fact, they are basically impossible, at least in 
general. Luckily, there are some simple types which can be solved without 
too much trouble. We’re going to look at three such types: first-order separable
equations, first-order linear equations, and linear constant-coefficient
equations.


30.2 Separable First-order Differential Equations

A first-order differential equation is called separable if you can put all the
$y$-stuff on one side (including the $dy$), and all the $x$-stuff on the other side
(including the $dx$). For example, the equation $dy/dx = ky$ can be rearranged
to read

$$\frac{1}{ky}\,dy = dx,$$

so it is separable. As another example, the equation

$$\frac{dy}{dx} + \cos^2(y)\cos(x) = 0$$

can be rearranged (check out the algebra yourself!) into

$$\sec^2(y)\,dy = -\cos(x)\,dx.$$


Now, the way to continue is simply to whack integral signs on both sides and
integrate, then rearrange* to solve for $y$. In the first example, we get

$$\int \frac{1}{ky}\,dy = \int dx,$$

which becomes

$$\frac{1}{k}\ln|y| = x + C,$$

where $C$ is a constant. To solve for $y$, multiply by $k$ and take exponentials.
We get

$$|y| = e^{kx + kC} = e^{kC}e^{kx}.$$

This means that $y = \pm e^{kC}e^{kx}$. Now, $\pm e^{kC}$ is just some other nonzero constant,
so let’s call it $A$, giving the solution $y = Ae^{kx}$ as we expected. (In
fact, $A$ can even be 0: indeed, if $y = 0$ for all $x$, the equation $dy/dx = ky$ is
obviously satisfied since both sides are 0. The reason this didn’t come up in
our solution above is that we divided by $y$; this assumed that $y$ is never 0.)
As for the second example above, integrating both sides gives

$$\int \sec^2(y)\,dy = \int -\cos(x)\,dx,$$

which leads to

$$\tan(y) = -\sin(x) + C,$$

where $C$ is a constant. This is perfectly good as a solution, but maybe you
are tempted to write

$$y = \tan^{-1}(-\sin(x) + C).$$

The problem with this is, the inverse tangent function has range $(-\pi/2, \pi/2)$
only. We should be able to add any integer multiple of $\pi$ to the above
expression and still get a valid solution. Indeed, $\sec^2(y)$ has period $\pi$, so the
complete solution should be

$$y = \tan^{-1}(-\sin(x) + C) + n\pi,$$

where $C$ is a constant and $n$ is an integer. Maybe we should just avoid these
issues by leaving it as $\tan(y) = -\sin(x) + C$. (Again, we divided by $\cos^2(y)$
right at the beginning of the solution; this caused us to miss the constant
solutions $y = n\pi/2$, where $n$ is an odd number, since that’s when $\cos^2(y) = 0$.
These solutions arise as $C \to \pm\infty$ in the above solution.)

How about the same example, but as an IVP (initial value problem)? For
example, consider the IVP

$$\frac{dy}{dx} + \cos^2(y)\cos(x) = 0, \qquad y(0) = \frac{\pi}{4}.$$

* As you might expect, these maneuvers can be fully justified by using the chain rule.

If you solve the differential equation using the above technique, you end up
with

$$\tan(y) = -\sin(x) + C$$

as before. Now put $x = 0$ and $y = \pi/4$ to get

$$\tan(\pi/4) = -\sin(0) + C,$$

which means that $C = 1$. So we have

$$\tan(y) = 1 - \sin(x).$$

Now if we write

$$y = \tan^{-1}(1 - \sin(x)) + n\pi,$$

where $n$ is an integer, we can again put $x = 0$ and $y = \pi/4$ to see that
$\pi/4 = \tan^{-1}(1) + n\pi$, which means that $n = 0$. So it’s fair to write the
solution as

$$y = \tan^{-1}(1 - \sin(x)).$$

To see this a bit more clearly, imagine that the initial condition is $y(0) = 5\pi/4$
instead of $y(0) = \pi/4$. Plugging this into the equation $\tan(y) = -\sin(x) + C$ once
again leads to $C = 1$, since $\tan(5\pi/4) = 1$. So once more, we find that we have
$\tan(y) = 1 - \sin(x)$, but it’s a mistake to write this as $y = \tan^{-1}(1 - \sin(x))$.
Why? When $x = 0$, we have

$$y = \tan^{-1}(1 - \sin(0)) = \tan^{-1}(1) = \frac{\pi}{4},$$

which isn’t what we want. So we need to add $\pi$ to make it work:

$$y = \tan^{-1}(1 - \sin(x)) + \pi.$$

Now the differential equation is satisfied, and $y(0) = 5\pi/4$ as we wanted. The
same precaution would be required if the initial condition were $y(0) = \pi/4 + n\pi$
for any nonzero integer $n$. These things require a delicate touch!
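The branch-picking business can be sketched in a few lines of Python. Since integrating $-\cos(x)$ gives $-\sin(x)$, the implicit solution of $dy/dx + \cos^2(y)\cos(x) = 0$ is $\tan(y) = C - \sin(x)$, and the code below chooses the integer $n$ in $y = \tan^{-1}(C - \sin(x)) + n\pi$ to match a given initial condition. It then checks the differential equation with a central-difference derivative. The function names `solve_branch` and `residual` are hypothetical helpers, not anything from the text.

```python
import math

def solve_branch(C, x0, y0):
    """Return y(x) = atan(C - sin(x)) + n*pi, with the integer n chosen so that y(x0) = y0."""
    base = math.atan(C - math.sin(x0))
    n = round((y0 - base) / math.pi)
    return (lambda x: math.atan(C - math.sin(x)) + n * math.pi), n

def residual(y, x, h=1e-5):
    """dy/dx + cos^2(y) cos(x), using a central difference for dy/dx; should be near 0."""
    dydx = (y(x + h) - y(x - h)) / (2 * h)
    return dydx + math.cos(y(x)) ** 2 * math.cos(x)

y_a, n_a = solve_branch(1.0, 0.0, math.pi / 4)        # initial condition y(0) = pi/4
y_b, n_b = solve_branch(1.0, 0.0, 5 * math.pi / 4)    # initial condition y(0) = 5*pi/4

print(n_a, n_b, y_b(0.0))
print(max(abs(residual(y_b, x)) for x in (0.0, 0.5, 1.0, 2.0)))
```

The first initial condition needs no shift, while the second needs exactly one extra $\pi$; in both cases the residual of the differential equation is zero up to rounding.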

30.3 First-order Linear Equations

Here’s a different type of first-order differential equation:

$$\frac{dy}{dx} + p(x)y = q(x),$$

where $p$ and $q$ are given functions of $x$. Such an equation is called a first-order
linear differential equation. It may not be separable, and it may not even look
particularly linear! For example,

$$\frac{dy}{dx} + 6x^2 y = e^{-2x^3}\sin(x)$$

doesn’t look very linear, yet this equation is indeed first-order linear. The
reason is that the powers of $y$ and $dy/dx$ are both one. So something like

$$\frac{dy}{dx} + 6x^2 y^3 = e^{-2x^3}\sin(x)$$

is no good, since $y^3$ is not first-degree in $y$. Similarly,

$$\left(\frac{dy}{dx}\right)^2 + 6x^2 y = e^{-2x^3}\sin(x)$$

is also not linear because the quantity $dy/dx$ is squared.

Let’s go back to the linear equation from above,

$$\frac{dy}{dx} + 6x^2 y = e^{-2x^3}\sin(x).$$

This equation isn’t separable. Try it! You won’t be able to get all the $y$-stuff
on one side and all the $x$-stuff on the other side. Luckily, there’s a neat trick
that will save the day. Imagine that we multiply both sides by the quantity
$e^{2x^3}$. This certainly makes the right-hand side nicer, as it happens, but there’s
actually a more interesting effect. Let’s see what happens:

$$e^{2x^3}\frac{dy}{dx} + 6x^2 e^{2x^3} y = \sin(x).$$

Watch carefully, now: there’s nothing up my sleeve as I rewrite this as

$$\frac{d}{dx}\left(e^{2x^3} y\right) = \sin(x).$$

How is this possible? Well, all I had to do was mentally reverse the product
rule while differentiating implicitly! (Piece of cake ...) To see that this is
correct, all you have to do is differentiate it out. Indeed, by the product rule,
one term is $e^{2x^3}$ times the derivative of $y$, that is, $e^{2x^3}(dy/dx)$; the other term
is $y$ times the derivative of $e^{2x^3}$, that is, $y \times 6x^2 e^{2x^3}$ (using the chain rule).
But that’s exactly what the original left-hand side was! So we do indeed have

$$\frac{d}{dx}\left(e^{2x^3} y\right) = \sin(x).$$

Now all we have to do is integrate both sides with respect to $x$. This cancels
out the derivative on the left-hand side, leaving

$$e^{2x^3} y = \int \sin(x)\,dx = -\cos(x) + C.$$

Dividing by $e^{2x^3}$, we get the solution

$$y = (C - \cos(x))e^{-2x^3},$$

where $C$ is an arbitrary constant. Now try differentiating this and check that
it satisfies the original differential equation!
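If you want to double-check your differentiation, here’s a rough numerical version of the same test. The sketch below plugs $y = (C - \cos(x))e^{-2x^3}$ into $dy/dx + 6x^2 y = e^{-2x^3}\sin(x)$ for a few arbitrary values of $C$ and $x$, estimating $dy/dx$ with a central difference. The sample points, the step size `h`, and the function names are all assumptions chosen for illustration.

```python
import math

def y(x, C):
    """Claimed general solution y = (C - cos(x)) * exp(-2x^3)."""
    return (C - math.cos(x)) * math.exp(-2 * x ** 3)

def residual(x, C, h=1e-6):
    """Left side minus right side of dy/dx + 6x^2 y = exp(-2x^3) sin(x)."""
    dydx = (y(x + h, C) - y(x - h, C)) / (2 * h)
    return dydx + 6 * x ** 2 * y(x, C) - math.exp(-2 * x ** 3) * math.sin(x)

# Try several constants C and several points x; every residual should be tiny.
worst = max(abs(residual(x, C))
            for x in (-0.5, 0.0, 0.3, 1.0)
            for C in (-2.0, 0.0, 3.5))
print(worst)
```

The fact that the residual stays near zero for every value of `C` is the numerical shadow of $C$ being an arbitrary constant.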

The key to the previous solution was multiplying by $e^{2x^3}$. When we did
this, we were able to wrap the left-hand side into $\frac{d}{dx}(\text{stuff})$, which could be
integrated easily. For this reason, the quantity $e^{2x^3}$ is called an integrating
factor. It turns out that for the general first-order linear differential equation

$$\frac{dy}{dx} + p(x)y = q(x),$$

a good integrating factor is $e^{\int p(x)\,dx}$.
It’s not a bad idea to check that this simplification is valid by differentiating
the left-hand side. In any case, integrate both sides of the above equation to
get

$$e^{-e^x} y = \int e^{2x} e^{-e^x}\,dx.$$

To do this integral, set $t = e^x$, so that $dt = e^x\,dx$. Note that you have to
write $e^{2x}$ as $e^x e^x$ to make it work. I leave it to you to do the integral (using
integration by parts) and check that the resulting equation is

$$e^{-e^x} y = -e^x e^{-e^x} - e^{-e^x} + C.$$

Finally, divide through by the integrating factor $e^{-e^x}$ to get

$$y = -e^x - 1 + Ce^{e^x}$$

for some constant $C$. Now all that’s left is to solve the IVP. When $x = 0$, we
know that $y = 2(e - 1)$, so inserting this into the above equation, we have

$$2(e - 1) = -e^0 - 1 + Ce^{e^0}.$$

You can easily solve this to see that $C = 2$, so the final solution is

$$y = 2e^{e^x} - e^x - 1.$$

Check by differentiating that this satisfies the original differential equation.

Let’s quickly go through one more example of a first-order linear differential
equation:

$$\tan(x)\frac{dy}{dx} = e^{\sin(x)} - y.$$

First, put the $y$-stuff on the left and divide by $\tan(x)$ to make the coefficient
of $dy/dx$ equal to 1:

$$\frac{dy}{dx} + \cot(x)y = \cot(x)e^{\sin(x)}.$$

The coefficient of $y$ is $\cot(x)$, so

$$\text{integrating factor} = e^{\int \cot(x)\,dx} = e^{\ln(\sin(x))} = \sin(x).$$

(Technically we should have written $|\sin(x)|$, but this complicates things
unnecessarily.) Anyway, multiply the differential equation through by $\sin(x)$ to
get

$$\sin(x)\frac{dy}{dx} + \cos(x)y = \cos(x)e^{\sin(x)},$$

since $\sin(x)\cot(x) = \cos(x)$. Now the left-hand side factors into the derivative
of $y$ times the integrating factor (check it):

$$\frac{d}{dx}(y\sin(x)) = \cos(x)e^{\sin(x)}.$$

Integrate both sides (use a substitution to simplify the right-hand side):

$$y\sin(x) = \int \cos(x)e^{\sin(x)}\,dx = e^{\sin(x)} + C.$$

Finally, divide through by $\sin(x)$ to get

$$y = \csc(x)e^{\sin(x)} + C\csc(x),$$

and we have found the solution to our differential equation.

In summary, here’s the method for dealing with first-order linear differential
equations:

• Put the stuff involving $y$ on the left-hand side and the stuff involving $x$
on the right-hand side, then divide through by the coefficient of $dy/dx$
to get the equation into the standard form

$$\frac{dy}{dx} + p(x)y = q(x).$$

• Multiply through by the integrating factor, which we’ll call $f(x)$, given
by

$$\text{integrating factor } f(x) = e^{\int p(x)\,dx},$$

where no $+C$ is needed in the integral in the exponent.

• The left-hand side becomes $\frac{d}{dx}(f(x)y)$, where $f(x)$ is the integrating
factor. Rewrite the equation with this new left-hand side.

• Integrate both sides; this time you must put a $+C$ on the right-hand
side.

• Divide by the integrating factor to solve for $y$.

Practice this and you won’t regret it!
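The steps above can also be carried out numerically, which makes a nice cross-check. In the sketch below, the integrals are done with Simpson’s rule, and the integral in the exponent runs from the starting point $x_0$, so the integrating factor equals 1 there and the $+C$ is replaced by the initial value $y_0$. It is applied to the standard form $dy/dx + \cot(x)y = \cot(x)e^{\sin(x)}$ from the last example. The initial condition $y(\pi/2) = e + 1$ is a hypothetical one, picked so that $C = 1$ in $y = \csc(x)e^{\sin(x)} + C\csc(x)$; the helper names are assumptions too.

```python
import math

def simpson(g, a, b, n=200):
    """Simpson's rule approximation to the integral of g from a to b (n even)."""
    h = (b - a) / n
    total = g(a) + g(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(a + i * h)
    return total * h / 3

def solve_linear_ivp(p, q, x0, y0, x):
    """Solve dy/dx + p(x) y = q(x), y(x0) = y0, at the point x via an integrating factor.

    Uses f(t) = exp(integral of p from x0 to t), so f(x0) = 1 and the constant
    of integration is fixed by the initial condition.
    """
    f = lambda t: math.exp(simpson(p, x0, t))
    integral_fq = simpson(lambda t: f(t) * q(t), x0, x)
    return (y0 + integral_fq) / f(x)

# Standard form of tan(x) y' = e^{sin x} - y:  y' + cot(x) y = cot(x) e^{sin x}
p = lambda x: math.cos(x) / math.sin(x)
q = lambda x: math.cos(x) / math.sin(x) * math.exp(math.sin(x))

# Hypothetical initial condition, chosen so that C = 1 in y = csc(x) e^{sin x} + C csc(x)
x0, y0 = math.pi / 2, math.e + 1
numeric = solve_linear_ivp(p, q, x0, y0, 1.0)
exact = (math.exp(math.sin(1.0)) + 1) / math.sin(1.0)
print(numeric, exact)
```

Doing the integrals by hand is still the point of the method; the code only confirms that the recipe produces the same number as the formula.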

30.3.1 Why the integrating factor works 

Why is the weird expression $e^{\int p(x)\,dx}$ a good integrating factor? Well, suppose
we take our general equation

$$\frac{dy}{dx} + p(x)y = q(x)$$

and multiply it by the integrating factor $e^{\int p(x)\,dx}$. We get

$$e^{\int p(x)\,dx}\frac{dy}{dx} + e^{\int p(x)\,dx}p(x)y = \text{stuff in } x.$$

I’m really focusing on the left-hand side for the moment, so I just wrote “stuff
in $x$” on the right. Now, we have claimed that we can rewrite the left-hand
side so that the above equation becomes

$$\frac{d}{dx}\left(e^{\int p(x)\,dx}y\right) = \text{stuff in } x;$$

this is much easier to deal with. To prove our claim, use the product rule on
the left-hand side to write it as

$$e^{\int p(x)\,dx}\frac{dy}{dx} + \frac{d}{dx}\left(e^{\int p(x)\,dx}\right)y.$$

That’s almost what we need; we just have to use the chain rule to write

$$\frac{d}{dx}\left(e^{\int p(x)\,dx}\right) = \frac{d}{dx}\left(\int p(x)\,dx\right) \times e^{\int p(x)\,dx} = p(x)e^{\int p(x)\,dx}.$$

Note that $\frac{d}{dx}\int p(x)\,dx = p(x)$, since $\int p(x)\,dx$ (without the $+C$) is an
antiderivative of $p$. Now if you assemble all the pieces from above, you can see
that

$$e^{\int p(x)\,dx}\frac{dy}{dx} + e^{\int p(x)\,dx}p(x)y = \frac{d}{dx}\left(e^{\int p(x)\,dx}y\right)$$

after all. Our method works!


30.4 Constant-coefficient Differential Equations


Now it’s time to look at linear differential equations with constant coefficients.
These equations look something like this:

$$a_n\frac{d^n y}{dx^n} + \cdots + a_2\frac{d^2 y}{dx^2} + a_1\frac{dy}{dx} + a_0 y = f(x).$$

Here $f$ is some function of $x$ only, and $a_n, \ldots, a_1, a_0$ are just plain old constant
real numbers. Notice that the left-hand side of the above equation looks a
bit like a polynomial in $y$, except that instead of taking powers of $y$, we are
taking derivatives.

Let’s look at a first-order example. Consider the differential equation

$$3\frac{dy}{dx} - \sin(5x) = 12x - 6y.$$

This can be rearranged to put all the purely $x$-stuff on the right and the $y$-stuff
(including the derivative) on the left. Finally, divide by 3 to get

$$\frac{dy}{dx} + 2y = 4x + \frac{1}{3}\sin(5x).$$



This is a first-order constant-coefficient linear equation. In fact, you can solve 
it by means of the techniques described in the previous section on first-order 
linear equations. If you do it that way, you’ll need to use an integrating factor, 
which is actually a bit of a pain in this case (try it and see!). We’ll soon look 
at another method to deal with such equations; in fact, we’ll solve the above 
example in Section 30.4.6 below. 

We’ll also examine the second-order case in some detail. In this case we
are dealing with equations like

$$a\frac{d^2 y}{dx^2} + b\frac{dy}{dx} + cy = f(x);$$

for example,

$$\frac{d^2 y}{dx^2} - 5\frac{dy}{dx} + 6y = 2x^2 e^x.$$

We’ll see how to solve this in Section 30.4.6 below. First, we need to look at
some general ideas for solving both first- and second-order constant-coefficient
linear equations.*

Let’s start by considering a simple case: assume there’s no stuff in x on 
the right-hand side. Two such examples are 

$$\frac{dy}{dx} - 3y = 0 \qquad\text{and}\qquad \frac{d^2 y}{dx^2} - \frac{dy}{dx} + 20y = 0.$$

Such equations are called homogeneous. Let’s look at how to solve first-order 
(like the left-hand example above) and second-order (like the right-hand one) 
homogeneous equations. 


30.4.1 Solving first-order homogeneous equations

This is pretty easy. The solution to

$$\frac{dy}{dx} + ay = 0$$

is just $y = Ae^{-ax}$. (In fact, this equation is simply $dy/dx = ky$ with $k = -a$;
see Sections 30.1 and 30.2 above.) For example, given the differential equation

$$\frac{dy}{dx} - 3y = 0,$$

you can simply write down the solution $y = Ae^{3x}$, where $A$ is some constant.


30.4.2 Solving second-order homogeneous equations 

This case is a little more involved. We need to solve

$$a\frac{d^2 y}{dx^2} + b\frac{dy}{dx} + cy = 0.$$

Although it might seem a little strange, the easiest way to do this is to pluck a
quadratic equation seemingly out of thin air. The quadratic equation, called
the characteristic quadratic equation, is $at^2 + bt + c = 0$. For example, consider
the following three differential equations:

(a) $y'' - y' - 20y = 0$    (b) $y'' + 6y' + 9y = 0$    (c) $y'' - 2y' + 5y = 0$.

Notice that we have written $y'$ instead of $dy/dx$ and $y''$ instead of $d^2y/dx^2$. In
any case, the characteristic quadratic equations of these three examples are
$t^2 - t - 20 = 0$, $t^2 + 6t + 9 = 0$, and $t^2 - 2t + 5 = 0$, respectively.

The next thing is to find the roots of the characteristic quadratic. There 
are three possibilities, depending on whether there are two real roots, one 
(double) real root or two complex roots. Let’s summarize the whole method, 
then solve the above three examples. 


* These ideas also work for higher-order equations, but we will concentrate on first- and 
second-order equations in this book. 
How to solve the homogeneous equation $ay'' + by' + cy = 0$:

1. Write down the characteristic quadratic equation $at^2 + bt + c = 0$ and
solve it for $t$.

2. If there are two different real roots $\alpha$ and $\beta$, the solution is

$$y = Ae^{\alpha x} + Be^{\beta x}.$$

3. If there is only one (double) real root $\alpha$, the solution is

$$y = Ae^{\alpha x} + Bxe^{\alpha x}.$$

4. If there are two complex roots, they will be conjugate to each other.
That is, they must be of the form $\alpha \pm i\beta$. The solution is

$$y = e^{\alpha x}(A\cos(\beta x) + B\sin(\beta x)).$$

In all three cases (2, 3 and 4), $A$ and $B$ are undetermined constants.

So, for example (a) above, we saw that the characteristic quadratic equation
is $t^2 - t - 20 = 0$. If you factor the quadratic as $(t + 4)(t - 5)$, it’s clear
that the solutions to the equation are $t = -4$ and $t = 5$. By step 2 above, we
see that the solution to our equation $y'' - y' - 20y = 0$ is given by

$$y = Ae^{-4x} + Be^{5x},$$

for some constants $A$ and $B$.

The characteristic quadratic equation $t^2 + 6t + 9 = 0$ in example (b) reduces
to $(t + 3)^2 = 0$, so the only solution is $t = -3$. By step 3 above, the solution
to the homogeneous equation $y'' + 6y' + 9y = 0$ is

$$y = Ae^{-3x} + Bxe^{-3x}.$$

Finally, if we use the quadratic formula to solve the characteristic quadratic
equation $t^2 - 2t + 5 = 0$ of example (c), we get $t = 1 \pm 2i$. (Try it and see!) So,
with $\alpha = 1$ and $\beta = 2$, step 4 above says that the solution to $y'' - 2y' + 5y = 0$
is

$$y = e^x(A\cos(2x) + B\sin(2x)).$$

Once again, $A$ and $B$ are undetermined constants.
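The four-step method in the box can be sketched as a small Python function. This is only an illustration under simple assumptions: the function name `homogeneous_solution` is made up, the output is just a descriptive string, and the test `disc == 0` is exact only when the coefficients are integers (or otherwise exactly representable), as they are in examples (a), (b) and (c).

```python
import math

def homogeneous_solution(a, b, c):
    """Describe the general solution of a*y'' + b*y' + c*y = 0 (a != 0) as a string."""
    disc = b * b - 4 * a * c              # discriminant of a*t^2 + b*t + c
    if disc > 0:                          # step 2: two different real roots
        r1 = (-b - math.sqrt(disc)) / (2 * a)
        r2 = (-b + math.sqrt(disc)) / (2 * a)
        return f"y = A*exp({r1:g}x) + B*exp({r2:g}x)"
    if disc == 0:                         # step 3: one (double) real root
        r = -b / (2 * a)
        return f"y = A*exp({r:g}x) + B*x*exp({r:g}x)"
    alpha = -b / (2 * a)                  # step 4: complex roots alpha +/- i*beta
    beta = math.sqrt(-disc) / (2 * abs(a))
    return f"y = exp({alpha:g}x)*(A*cos({beta:g}x) + B*sin({beta:g}x))"

for coeffs in [(1, -1, -20), (1, 6, 9), (1, -2, 5)]:   # examples (a), (b), (c)
    print(homogeneous_solution(*coeffs))
```

Running it on the three examples reproduces the roots $-4$ and $5$, the double root $-3$, and the complex pair $1 \pm 2i$ found above.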

30.4.3 Why the characteristic quadratic works

Now let’s see why the above method works. (If you don’t care why, you’d
better move on to the next section!) Otherwise, consider what happens when
you put $y = e^{\alpha x}$ in the equation $ay'' + by' + cy = 0$. We have $y' = \alpha e^{\alpha x}$ and
$y'' = \alpha^2 e^{\alpha x}$, so

$$ay'' + by' + cy = a\alpha^2 e^{\alpha x} + b\alpha e^{\alpha x} + ce^{\alpha x} = (a\alpha^2 + b\alpha + c)e^{\alpha x}.$$

So, if $\alpha$ is a root of the characteristic quadratic $at^2 + bt + c$, then we have
$a\alpha^2 + b\alpha + c = 0$. The above equation now implies that $ay'' + by' + cy = 0$;
that is, $y = e^{\alpha x}$ solves our differential equation! Also, any constant multiple
of this solves the equation, and if you have another root $\beta$, then you can add
the two solutions $y = Ae^{\alpha x}$ and $y = Be^{\beta x}$ to get more solutions (try it and
see). That takes care of step 2 above.

Let’s look at step 4 next. If the two solutions to the quadratic are complex
conjugates of the form $\alpha \pm i\beta$, then by the same argument as for step 2, the
solution must be

$$y = Ae^{(\alpha + i\beta)x} + Be^{(\alpha - i\beta)x} = e^{\alpha x}\left(Ae^{i\beta x} + Be^{-i\beta x}\right),$$

where $A$ and $B$ can even be complex numbers. Now you can use Euler’s
identity (see Section 28.2 in Chapter 28) to see that

$$\begin{aligned}
y &= e^{\alpha x}\left(A(\cos(\beta x) + i\sin(\beta x)) + B(\cos(\beta x) - i\sin(\beta x))\right)\\
&= e^{\alpha x}\left((A + B)\cos(\beta x) + (A - B)i\sin(\beta x)\right).
\end{aligned}$$

Relabel the constant $(A + B)$ as $A$ and the constant $(A - B)i$ as $B$ to get the
correct formula.

Finally, for step 3, suppose the characteristic quadratic has just one root,
$\alpha$. If you substitute $y = xe^{\alpha x}$ into the differential equation $ay'' + by' + cy = 0$,
you can use $y' = \alpha xe^{\alpha x} + e^{\alpha x}$ and $y'' = \alpha^2 xe^{\alpha x} + 2\alpha e^{\alpha x}$ to see that

$$ay'' + by' + cy = (a\alpha^2 + b\alpha + c)xe^{\alpha x} + (2a\alpha + b)e^{\alpha x}.$$

If $\alpha$ is a double root of $at^2 + bt + c$, then not only does $a\alpha^2 + b\alpha + c = 0$, but
also $2a\alpha + b = 0$.* This leads to the correct solution from step 3 above.

30.4.4 Nonhomogeneous equations and particular solutions 

Now let’s see what happens if we do have some stuff in $x$ alone, which we put
on the right-hand side. For example, consider the differential equation

$$y'' - y' - 20y = e^x.$$

This isn’t homogeneous because of the $e^x$ term on the right-hand side. Suppose
we try to guess a solution. We know that the derivatives of $e^x$ are all
$e^x$, so let’s try $y = e^x$. Then $y' = e^x$ and $y'' = e^x$, so the left-hand side
$y'' - y' - 20y$ becomes $e^x - e^x - 20e^x = -20e^x$. That’s not equal to the
right-hand side, but it’s pretty close. We just have to divide by $-20$. So, let’s
try again: set $y = -\frac{1}{20}e^x$. Then $y'$ and $y''$ are also $-\frac{1}{20}e^x$, so we have

$$y'' - y' - 20y = -\frac{1}{20}e^x - \left(-\frac{1}{20}e^x\right) - 20\left(-\frac{1}{20}e^x\right) = e^x.$$

So we have shown that $y = -\frac{1}{20}e^x$ is a solution to our original equation

* Here’s why $2a\alpha + b = 0$ if the quadratic $at^2 + bt + c = 0$ has a double root at $t = \alpha$:
the discriminant is 0, so $b^2 = 4ac$. Then

$$(2a\alpha + b)^2 = 4a^2\alpha^2 + 4ab\alpha + b^2 = 4a^2\alpha^2 + 4ab\alpha + 4ac = 4a(a\alpha^2 + b\alpha + c) = 0.$$

Since $(2a\alpha + b)^2 = 0$, we also have $2a\alpha + b = 0$.
which actually solves the differential equation; then add any solution to the
homogeneous version of the differential equation; the result is still a solution
to the original differential equation. Furthermore, all the solutions to the
nonhomogeneous equation are in this form.

The same methodology works for both the first-order and the second-order
cases. The only issue is how to guess the particular solution. In the
next section, we’ll see how to make a guess of what the form of the solution
should be (this is similar to the partial fraction technique from Section 18.3
of Chapter 18). Then if you’re lucky, you can plug in that form and find the
unknown constants in order to nail down the particular solution.

Here’s a summary of our methods so far:

1. Rearrange the equation into the correct form. That is, put all the $x$-junk
on the right-hand side. You should be able to reduce the equation to

$$\frac{dy}{dx} + ay = f(x)$$

for the first-order case, or

$$a\frac{d^2 y}{dx^2} + b\frac{dy}{dx} + cy = f(x)$$

for the second-order case.

2. Using the techniques from Sections 30.4.1 and 30.4.2 above, solve the
associated homogeneous equation

$$\frac{dy}{dx} + ay = 0 \qquad\text{or}\qquad a\frac{d^2 y}{dx^2} + b\frac{dy}{dx} + cy = 0.$$

The solution, which we’ll write as $y_H$, will have one or two undetermined
constants in it (depending on whether the equation is first- or second-order).
We call $y_H$ the homogeneous solution of the equation.

3. If the original function $f$ is actually 0, then we’re already done; the
complete solution is $y = y_H$.

4. On the other hand, if the function $f$ is anything other than 0, then
write down the form for the particular solution $y_P$ (see Section 30.4.5
below). The form will have some constants which must be determined.
Substitute $y_P$ into the original equation and equate coefficients to find
the constants.

5. Finally, the solution is $y = y_H + y_P$.

We’ll look at what happens if you are dealing with an initial value problem 
(IVP) in Section 30.4.8 below. Meanwhile, let’s see how to find a particular 
solution. 

30.4.5 Finding a particular solution

So far, we have blissfully ignored the stuff involving $x$ which could appear on
the right-hand side (it was called $f(x)$ earlier). Now it’s time to deal with it.
The tactic is to write down the form of the particular solution, then to find
the actual solution by plugging the form into the equation. The table on the
next page shows how to come up with the correct form. For example, in the
differential equation

$$y' - 3y = 5e^{2x},$$

the right-hand side is a multiple of $e^{2x}$; so the table indicates that the form
should be $y_P = Ce^{2x}$, where $C$ is a constant that we have to find by substituting
$y_P$ into the original equation. It’s easy to see that $y_P' = 2Ce^{2x}$, so we
have

$$2Ce^{2x} - 3(Ce^{2x}) = 5e^{2x}.$$

This reduces to $-Ce^{2x} = 5e^{2x}$, so $C = -5$. The particular solution is therefore
$y_P = -5e^{2x}$. In fact, since we saw in Section 30.4.1 above that the solution
30.4.6 Examples of finding particular solutions

Once you’ve written down the form for $y_P$, you still have to substitute $y_P$
into the original differential equation in order to find the constants. To make
the calculation easier, you should first calculate $y_P'$ and $y_P''$ (for the first-order
case, you actually only need $y_P'$). Let’s look at one example of this; then we’ll
finally go back and complete the two unresolved examples from Section 30.4
above.

First consider the differential equation

$$y'' - 4y' + 4y = 25e^{3x}\sin(2x).$$

Let’s quickly dispense with the homogeneous part; in fact, the characteristic
quadratic equation for $y'' - 4y' + 4y = 0$ is $t^2 - 4t + 4 = 0$, which has one
solution, namely $t = 2$. So we have $y_H = Ae^{2x} + Bxe^{2x}$, where $A$ and $B$ are
constants. Now let’s look for a particular solution. Break up the right-hand
side of our differential equation, $25e^{3x}\sin(2x)$, into two components: $25e^{3x}$
and $\sin(2x)$. According to the above table, the form for a constant multiple
of $e^{3x}$ is $Ce^{3x}$; and the form for $\sin(2x)$ is $C\cos(2x) + D\sin(2x)$. We need to
multiply these together, but we can consolidate the constants as we do so and
write

$$y_P = e^{3x}(C\cos(2x) + D\sin(2x))$$

as our form. Now, let’s do some fiddly calculations using the product rule
many times:

$$\begin{aligned}
y_P &= e^{3x}(C\cos(2x) + D\sin(2x)),\\
y_P' &= e^{3x}(-2C\sin(2x) + 2D\cos(2x)) + 3e^{3x}(C\cos(2x) + D\sin(2x))\\
&= e^{3x}((3C + 2D)\cos(2x) + (3D - 2C)\sin(2x)),\\
y_P'' &= e^{3x}(-2(3C + 2D)\sin(2x) + 2(3D - 2C)\cos(2x))\\
&\quad + 3e^{3x}((3C + 2D)\cos(2x) + (3D - 2C)\sin(2x))\\
&= e^{3x}((5C + 12D)\cos(2x) + (5D - 12C)\sin(2x)).
\end{aligned}$$

Now it’s time to substitute this mess into the original differential equation
$y'' - 4y' + 4y = 25e^{3x}\sin(2x)$. We get the gross-looking equation

$$\begin{aligned}
&e^{3x}((5C + 12D)\cos(2x) + (5D - 12C)\sin(2x))\\
&\quad - 4e^{3x}((3C + 2D)\cos(2x) + (3D - 2C)\sin(2x))\\
&\quad + 4e^{3x}(C\cos(2x) + D\sin(2x)) = 25e^{3x}\sin(2x),
\end{aligned}$$

which mercifully simplifies to

$$e^{3x}(4D - 3C)\cos(2x) + e^{3x}(-4C - 3D)\sin(2x) = 25e^{3x}\sin(2x).$$

To make these expressions equal for all $x$, the $e^{3x}\cos(2x)$ stuff has to disappear
and the coefficient of $e^{3x}\sin(2x)$ must be 25. This means that $4D - 3C = 0$
and $-4C - 3D = 25$. Solving these equations simultaneously, you should get
$C = -4$ and $D = -3$. We now know that $y_P = e^{3x}(-4\cos(2x) - 3\sin(2x))$,
so the complete solution is

$$y = y_H + y_P = Ae^{2x} + Bxe^{2x} - e^{3x}(4\cos(2x) + 3\sin(2x)),$$

where $A$ and $B$ are constants.
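With this many product rules flying around, an independent check is comforting. The sketch below plugs the complete solution $y = Ae^{2x} + Bxe^{2x} - e^{3x}(4\cos(2x) + 3\sin(2x))$ into $y'' - 4y' + 4y - 25e^{3x}\sin(2x)$ for a few arbitrary choices of $A$, $B$ and $x$, using central differences to estimate the derivatives. The sample values and step size are assumptions for illustration, and finite differences are only approximate, so “zero” here means “very small.”

```python
import math

def y(x, A, B):
    """Claimed complete solution: A e^{2x} + B x e^{2x} - e^{3x}(4 cos 2x + 3 sin 2x)."""
    return (A * math.exp(2 * x) + B * x * math.exp(2 * x)
            - math.exp(3 * x) * (4 * math.cos(2 * x) + 3 * math.sin(2 * x)))

def residual(x, A, B, h=1e-4):
    """y'' - 4y' + 4y - 25 e^{3x} sin(2x), with derivatives from central differences."""
    y0, yp, ym = y(x, A, B), y(x + h, A, B), y(x - h, A, B)
    d1 = (yp - ym) / (2 * h)              # approximates y'
    d2 = (yp - 2 * y0 + ym) / (h * h)     # approximates y''
    return d2 - 4 * d1 + 4 * y0 - 25 * math.exp(3 * x) * math.sin(2 * x)

worst = max(abs(residual(x, A, B))
            for x in (-1.0, 0.0, 0.5)
            for (A, B) in ((0.0, 0.0), (1.0, -2.0), (3.0, 0.5)))
print(worst)
```

The residual stays small whatever $A$ and $B$ are, because the $y_H$ part contributes nothing to the left-hand side.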

Now it’s time to finish off two of the examples from Section 30.4 above, as
promised:

$$y' + 2y = 4x + \frac{1}{3}\sin(5x) \qquad\text{and}\qquad y'' - 5y' + 6y = 2x^2 e^x.$$

At this point, you should try to solve them both. Once you’ve done that, read
on.

The left-hand example is a first-order equation. The homogeneous version
is $y' + 2y = 0$, which has the solution $y = Ae^{-2x}$, where $A$ is a constant. Upon
consulting the above table, we see that the form for a particular solution is
$y_P = ax + b + C\cos(5x) + D\sin(5x)$. We’ll need the derivative, namely
$y_P' = a - 5C\sin(5x) + 5D\cos(5x)$. Substituting $y_P'$ and $y_P$ into the original
equation, we get

$$(a - 5C\sin(5x) + 5D\cos(5x)) + 2(ax + b + C\cos(5x) + D\sin(5x)) = 4x + \frac{1}{3}\sin(5x),$$

which reduces to

$$2ax + 2b + a + (5D + 2C)\cos(5x) + (2D - 5C)\sin(5x) = 4x + \frac{1}{3}\sin(5x).$$

Now we have to equate coefficients of various components of this expression.
The coefficient of $x$ is $2a$ on the left-hand side and 4 on the right-hand side,
so $a = 2$. The constant coefficient on the left is $2b + a$, whereas there’s no
constant on the right, so $2b + a = 0$. This means that $b = -1$. Meanwhile,
there’s no term in $\cos(5x)$ on the right, so $5D + 2C = 0$. On the other hand,
the $\sin(5x)$ terms must match, so we have $2D - 5C = 1/3$. Solving these last
two equations simultaneously (try it!) gives $C = -5/87$ and $D = 2/87$. So,
we have

$$y_P = 2x - 1 - \frac{5}{87}\cos(5x) + \frac{2}{87}\sin(5x);$$

putting it all together, we get the solution

$$y = y_H + y_P = Ae^{-2x} + 2x - 1 - \frac{5}{87}\cos(5x) + \frac{2}{87}\sin(5x),$$

where $A$ is a constant.
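If the fractions with 87 in the denominator make you nervous, the pair of simultaneous equations $2C + 5D = 0$ and $-5C + 2D = 1/3$ can be solved exactly in a couple of lines. The sketch below uses Cramer’s rule with Python’s `Fraction` type so that no rounding occurs; the helper name `solve_2x2` is an assumption for illustration.

```python
from fractions import Fraction

def solve_2x2(a11, a12, b1, a21, a22, b2):
    """Solve a11*u + a12*v = b1 and a21*u + a22*v = b2 by Cramer's rule."""
    det = a11 * a22 - a12 * a21
    if det == 0:
        raise ValueError("system has no unique solution")
    u = (b1 * a22 - a12 * b2) / det
    v = (a11 * b2 - b1 * a21) / det
    return u, v

# Unknowns (C, D):  2C + 5D = 0  and  -5C + 2D = 1/3
C, D = solve_2x2(Fraction(2), Fraction(5), Fraction(0),
                 Fraction(-5), Fraction(2), Fraction(1, 3))
print(C, D)
```

The determinant is $2 \cdot 2 - 5 \cdot (-5) = 29$, and $3 \times 29 = 87$ is where that denominator comes from.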

How about the other example above? That’s a second-order equation,
with homogeneous version given by $y'' - 5y' + 6y = 0$. The characteristic
quadratic equation is $t^2 - 5t + 6 = 0$, which has solutions $t = 2$ and $t = 3$. So,
$y_H = Ae^{2x} + Be^{3x}$, where $A$ and $B$ are constants. Now it’s time to deal with
the particular solution. Since the right-hand side of the original differential
equation is $2x^2 e^x$, the form should be $y_P = (ax^2 + bx + c)e^x$; remember that
you don’t need a constant outside of the $e^x$, since that constant could be
absorbed into $a$, $b$ and $c$. Let’s differentiate $y_P$ a couple of times:

$$\begin{aligned}
y_P &= (ax^2 + bx + c)e^x,\\
y_P' &= (ax^2 + bx + c)e^x + (2ax + b)e^x\\
&= (ax^2 + (2a + b)x + (b + c))e^x,\\
y_P'' &= (ax^2 + (2a + b)x + (b + c))e^x + (2ax + (2a + b))e^x\\
&= (ax^2 + (4a + b)x + (2a + 2b + c))e^x.
\end{aligned}$$

Now substitute into the original equation $y'' - 5y' + 6y = 2x^2 e^x$ to get

$$\begin{aligned}
&(ax^2 + (4a + b)x + (2a + 2b + c))e^x - 5(ax^2 + (2a + b)x + (b + c))e^x\\
&\quad + 6(ax^2 + bx + c)e^x = 2x^2 e^x.
\end{aligned}$$

This simplifies to

$$(2ax^2 + (-6a + 2b)x + (2a - 3b + 2c))e^x = 2x^2 e^x.$$

Now equate coefficients to see that $2a = 2$, $-6a + 2b = 0$ and $2a - 3b + 2c = 0$.
This means that $a = 1$, $b = 3$ and $c = 7/2$, so $y_P = (x^2 + 3x + \frac{7}{2})e^x$. The
solution to the whole equation is therefore

$$y = y_H + y_P = Ae^{2x} + Be^{3x} + \left(x^2 + 3x + \frac{7}{2}\right)e^x,$$

where $A$ and $B$ are constants.


30.4.7 Resolving conflicts between $y_P$ and $y_H$

The last line of the table in Section 30.4.5 above indicates that there might
be conflicts between $y_P$ and $y_H$. How can this happen? Well, consider the
differential equation

$$y'' - 3y' + 2y = 7e^{2x}.$$

The homogeneous version is $y'' - 3y' + 2y = 0$, with characteristic quadratic
equation given by $t^2 - 3t + 2 = (t - 1)(t - 2) = 0$, so the homogeneous solution
is

$$y_H = Ae^x + Be^{2x}.$$

Here $A$ and $B$ are unknown constants. Now, since the right-hand side of the
differential equation is $7e^{2x}$, our table says that the form for the particular
solution is $y_P = Ce^{2x}$. The sad fact, alas, is that this choice will crash and
burn. Indeed, this $y_P$ is included in $y_H$ by setting $A = 0$ and $B = C$. This
means that if you plug $y_P = Ce^{2x}$ into the differential equation, you will get 0
on the left-hand side (try it!), so it doesn’t work. Instead, as the final line of
the table indicates, you need to introduce an extra power of $x$ to make it work.
So, we’ll use $y_P = Cxe^{2x}$ instead. Let’s see what happens now. First, note
that $y_P' = 2Cxe^{2x} + Ce^{2x}$ and $y_P'' = 4Cxe^{2x} + 4Ce^{2x}$, so when you substitute
into the differential equation above, you get

$$(4Cxe^{2x} + 4Ce^{2x}) - 3(2Cxe^{2x} + Ce^{2x}) + 2Cxe^{2x} = 7e^{2x}.$$

The terms in $xe^{2x}$ cancel completely, and you’re left with $Ce^{2x} = 7e^{2x}$. So
$C = 7$, meaning that $y_P = 7xe^{2x}$. Finally, the complete solution is given by
$y = y_H + y_P = Ae^x + Be^{2x} + 7xe^{2x}$.

One more example. If you want to solve

$$y'' + 6y' + 9y = e^{-3x},$$

you’ll have to go even further than before. Now the homogeneous equation
$y'' + 6y' + 9y = 0$ has characteristic quadratic $t^2 + 6t + 9 = (t + 3)^2$, so
the homogeneous solution is $y_H = Ae^{-3x} + Bxe^{-3x}$. Since the right-hand
side of the differential equation is $e^{-3x}$, we’d want to take $y_P = Ce^{-3x}$.
That won’t work, since it’s included in $y_H$ (with A = C and B = 0). Even
$y_P = Cxe^{-3x}$ won’t work, since that’s also included in $y_H$ (with A = 0 and
B = C). So we have to go all the way up to $x^2$ and set $y_P = Cx^2e^{-3x}$.

Now you can differentiate twice to see that $y_P' = 2Cxe^{-3x} - 3Cx^2e^{-3x}$ and
$y_P'' = 2Ce^{-3x} - 12Cxe^{-3x} + 9Cx^2e^{-3x}$ (check this!). I leave it to you to plug
these quantities into the original equation and show that it all simplifies to
$2Ce^{-3x} = e^{-3x}$. This means that C = 1/2, so the solution to the differential
equation is $y = y_H + y_P = Ae^{-3x} + Bxe^{-3x} + \tfrac{1}{2}x^2e^{-3x}$ for some constants A
and B.
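To see why the guess has to climb all the way to $x^2$, here is a small Python sketch. It is my own illustration, and the helper names `term` and `lhs_for_trial` are hypothetical. It evaluates $y'' + 6y' + 9y$ for the trial functions $x^k e^{-3x}$ with k = 0, 1, 2, using the product rule by hand rather than any symbolic library.

```python
import math

def term(coef, x, power):
    """coef * x**power, skipping the power entirely when coef is zero."""
    return 0.0 if coef == 0 else coef * x ** power

def lhs_for_trial(k, x):
    """y'' + 6y' + 9y for the trial function y = x^k e^{-3x}."""
    e = math.exp(-3 * x)
    y = x ** k * e
    dy = (term(k, x, k - 1) - 3 * x ** k) * e
    d2y = (term(k * (k - 1), x, k - 2) - term(6 * k, x, k - 1) + 9 * x ** k) * e
    return d2y + 6 * dy + 9 * y

if __name__ == "__main__":
    x = 0.5
    for k in (0, 1, 2):
        # k = 0 and k = 1 give (numerically) zero; k = 2 gives 2e^{-3x}.
        print(k, lhs_for_trial(k, x), 2 * math.exp(-3 * x))
```

Only k = 2 leaves something nonzero behind, namely $2e^{-3x}$, which is why C = 1/2 does the job.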

30.4.8 Initial value problems (constant-coefficient linear) 

Let’s see how to deal with initial-value problems (IVPs) involving constant- 
coefficient linear differential equations. As usual, to solve an IVP, first solve 
the differential equation, then use the initial conditions to find the remaining 
unknown constants. 

Let’s modify the last two examples from Section 30.4.6 above to make
them into IVPs, then solve them. For the first example, suppose you are
given that $y' + 2y = 4x + \tfrac{1}{3}\sin(5x)$, and that $y(0) = -1$. Well, ignoring the
condition $y(0) = -1$ for the moment, we already saw that the general solution
is

$$y = Ae^{-2x} + 2x - 1 - \tfrac{5}{87}\cos(5x) + \tfrac{2}{87}\sin(5x).$$

Now we also know that $y(0) = -1$, which means that when x = 0, y = −1.
Substituting this in, we get

$$-1 = Ae^{0} + 2(0) - 1 - \tfrac{5}{87}\cos(0) + \tfrac{2}{87}\sin(0) = A - 1 - \tfrac{5}{87}.$$

This reduces to A = 5/87, so the solution to the IVP is

$$y = \tfrac{5}{87}e^{-2x} + 2x - 1 - \tfrac{5}{87}\cos(5x) + \tfrac{2}{87}\sin(5x).$$

There are no unknown constants.

To modify the second example, let’s suppose that $y'' - 5y' + 6y = 2x^2e^{x}$
and that $y(0) = y'(0) = 0$. As we saw in Section 30.4.6, the general solution
(ignoring the initial conditions $y(0) = 0$ and $y'(0) = 0$) is given by

$$y = Ae^{2x} + Be^{3x} + \left(x^2 + 3x + \tfrac{7}{2}\right)e^{x}.$$

We’ll need to differentiate this once to take advantage of the fact that we
know what $y'(0)$ is; check that

$$y' = 2Ae^{2x} + 3Be^{3x} + \left(x^2 + 5x + \tfrac{13}{2}\right)e^{x}.$$

So, when x = 0, we know that both y and y' are equal to 0; substituting into
the equation for y gives

$$0 = Ae^{0} + Be^{0} + \left(0^2 + 3(0) + \tfrac{7}{2}\right)e^{0} = A + B + \tfrac{7}{2},$$

whereas substituting into the equation for y' gives

$$0 = 2Ae^{0} + 3Be^{0} + \left(0^2 + 5(0) + \tfrac{13}{2}\right)e^{0} = 2A + 3B + \tfrac{13}{2}.$$

Solving these equations simultaneously, we get A = −4 and B = 1/2. This
means that the solution to the IVP is

$$y = -4e^{2x} + \tfrac{1}{2}e^{3x} + \left(x^2 + 3x + \tfrac{7}{2}\right)e^{x}.$$




Notice that in both examples there are no constants left: the initial conditions 
have allowed us to home in on the unique solution. Without initial conditions, 
there will always be one or two unknown constants. 
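If you want to double-check the simultaneous equations from the second example without any arithmetic slips, here is a short Python sketch using exact fractions. The helper `solve_2x2` is a hypothetical name of my own; it just applies Cramer’s rule to the two equations A + B = −7/2 and 2A + 3B = −13/2.

```python
from fractions import Fraction

def solve_2x2(a11, a12, b1, a21, a22, b2):
    """Solve a11*A + a12*B = b1, a21*A + a22*B = b2 exactly by Cramer's rule."""
    a11, a12, b1, a21, a22, b2 = map(Fraction, (a11, a12, b1, a21, a22, b2))
    det = a11 * a22 - a12 * a21
    if det == 0:
        raise ValueError("the two equations do not determine A and B uniquely")
    return (b1 * a22 - a12 * b2) / det, (a11 * b2 - a21 * b1) / det

# From y(0) = 0:    A +  B = -7/2
# From y'(0) = 0:  2A + 3B = -13/2
A, B = solve_2x2(1, 1, Fraction(-7, 2), 2, 3, Fraction(-13, 2))
print(A, B)  # prints: -4 1/2
```

Working in `Fraction` rather than floating point keeps the answer exact, so it can be compared directly with the values found by hand.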

Let’s look at one last IVP example. Suppose that

$$y'' + 6y' + 13y = 26x^3 - 3x^2 - 24x, \qquad y(0) = 1, \quad y'(0) = 2.$$

The homogeneous equation is $y'' + 6y' + 13y = 0$, with characteristic quadratic
equation $t^2 + 6t + 13 = 0$. Using the quadratic formula, the solutions to this
last equation are $t = (-6 \pm \sqrt{36 - 4 \cdot 13})/2 = -3 \pm 2i$. This means that
$y_H = e^{-3x}(A\cos(2x) + B\sin(2x))$. Turning now to the particular solution:
since the right-hand side (the x-stuff) of the original equation is a cubic, we
should write down the form $y_P = ax^3 + bx^2 + cx + d$. Now we have to find the
constants a through d by substituting $y_P$ into the differential equation. Note
that $y_P' = 3ax^2 + 2bx + c$ and $y_P'' = 6ax + 2b$. Substituting, we get

$$(6ax + 2b) + 6(3ax^2 + 2bx + c) + 13(ax^3 + bx^2 + cx + d) = 26x^3 - 3x^2 - 24x.$$

Equating coefficients (just as we did for partial fractions) for $x^3$, $x^2$, $x$, and 1,
we get 13a = 26, 18a + 13b = −3, 6a + 12b + 13c = −24, and 2b + 6c + 13d = 0,
respectively. I leave it to you to solve these equations and see that a = 2,
b = −3, c = 0, and d = 6/13. So $y_P = 2x^3 - 3x^2 + 6/13$, and therefore

$$y = y_H + y_P = e^{-3x}(A\cos(2x) + B\sin(2x)) + 2x^3 - 3x^2 + \tfrac{6}{13}$$


for some constants A and B. Now, to find these constants, let’s use the initial
conditions. Since $y(0) = 1$, we know that y = 1 when x = 0; substituting, we
have

$$1 = e^{0}(A\cos(0) + B\sin(0)) + 2(0)^3 - 3(0)^2 + \tfrac{6}{13} = A + \tfrac{6}{13},$$

so A = 7/13. Meanwhile, differentiating the expression for y gives

$$y' = e^{-3x}(-2A\sin(2x) + 2B\cos(2x)) - 3e^{-3x}(A\cos(2x) + B\sin(2x)) + 6x^2 - 6x.$$

Now, since $y'(0) = 2$, we know that y' = 2 when x = 0; substituting this into
the above expression for y', we get

$$2 = e^{0}(-2A\sin(0) + 2B\cos(0)) - 3e^{0}(A\cos(0) + B\sin(0)) + 6(0)^2 - 6(0) = 2B - 3A.$$

Since A = 7/13, we can solve this last equation to find that B = 47/26. Now
we plug these values in to find the final answer:

$$y = e^{-3x}\left(\tfrac{7}{13}\cos(2x) + \tfrac{47}{26}\sin(2x)\right) + 2x^3 - 3x^2 + \tfrac{6}{13}.$$

Once again, note that there are no constants involved here: the initial
conditions (that is, the values of $y(0)$ and $y'(0)$) pinpoint the explicit solution.
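The chain of small linear equations in this last example is easy to check by machine. The following Python sketch is an illustration of mine, not the book’s method. It solves for a, b, c, d one at a time and then uses the two initial conditions, all in exact fractions.

```python
from fractions import Fraction

# Equate coefficients of x^3, x^2, x and 1 in turn.
a = Fraction(26, 13)                 # 13a = 26
b = (-3 - 18 * a) / 13               # 18a + 13b = -3
c = (-24 - 6 * a - 12 * b) / 13      # 6a + 12b + 13c = -24
d = (-2 * b - 6 * c) / 13            # 2b + 6c + 13d = 0

# Initial conditions: y(0) = A + d = 1 and y'(0) = 2B - 3A = 2.
A = 1 - d
B = (2 + 3 * A) / 2

print(a, b, c, d, A, B)  # prints: 2 -3 0 6/13 7/13 47/26
```

Because each equation introduces exactly one new unknown, plain back-substitution is enough; no general linear solver is needed.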


30.5 Modeling Using Differential Equations

Many quantities in the real world can be modeled (that is, theoretically 
approximated) by differential equations. Examples include heat flow, wave 
height, inflation, current in electrical circuits, and population growth, to name 
a few. Here’s a simple example of a somewhat realistic situation involving 
population growth. 

A certain culture of bacteria grows exponentially in such a way that its 
instantaneous hourly rate of increase is equal to twice the number of bacteria 
in the culture. Suppose that an antibiotic is continuously introduced into the 
culture at the constant rate of 8 ounces per hour. Each ounce of antibiotic 
present kills 25,000 of the bacteria per hour. What is the minimum initial 
population of bacteria that need to be present in order to ensure that the 
culture is never completely wiped out? 

The idea here is that the number of bacteria is increasing as they breed,
but the amount of killer antibiotic is increasing too as it gets pumped into the
petri dish. Which one wins, the bacteria or the antibiotic? To find out, we
need to write down a differential equation that models the situation. In effect,
we have to translate the word problem into a differential equation. If there
were no antibiotic, you’d have the standard population growth differential
equation with k = 2:

$$\frac{dP}{dt} = 2P,$$

where P is the population at time t hours. (We looked at this sort of thing in
Section 9.6.1 of Chapter 9.) Now we have to modify this to take the antibiotic
into account. At time t hours, we know there are 8t ounces of antibiotic
present, so the death rate due to this amount present is 8t × 25000 = 200000t.
The correct differential equation is therefore

$$\frac{dP}{dt} = 2P - 200000t.$$

This can be rearranged into standard form as

$$\frac{dP}{dt} - 2P = -200000t.$$

The integrating factor (see Section 30.3 above) for this first-order linear
equation is $e^{\int -2\,dt}$, which simplifies to $e^{-2t}$. Multiplying the equation by the
integrating factor, we get

$$e^{-2t}\frac{dP}{dt} - 2e^{-2t}P = -200000\,te^{-2t}.$$

As usual, the left-hand side simplifies to the derivative of P times the
integrating factor:

$$\frac{d}{dt}\left(e^{-2t}P\right) = -200000\,te^{-2t},$$

or just

$$e^{-2t}P = -200000\int te^{-2t}\,dt.$$

The right-hand side needs to be integrated by parts (see Section 18.2 in
Chapter 18); I leave it to you to show that

$$e^{-2t}P = 100000\,te^{-2t} + 50000\,e^{-2t} + 200000C.$$
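For reference, here is one way the integration by parts can go. This is only a sketch: the choice $u = t$, $dv = e^{-2t}\,dt$ is mine, and I have written the arbitrary constant as $-C$ so that its sign matches the display above.

```latex
\begin{align*}
\int t e^{-2t}\,dt
  &= t\left(-\tfrac{1}{2}e^{-2t}\right) - \int \left(-\tfrac{1}{2}e^{-2t}\right)dt
     && (u = t,\ dv = e^{-2t}\,dt) \\
  &= -\tfrac{1}{2}\,t e^{-2t} - \tfrac{1}{4}\,e^{-2t} - C, \\
-200000 \int t e^{-2t}\,dt
  &= 100000\,t e^{-2t} + 50000\,e^{-2t} + 200000\,C.
\end{align*}
```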


Now we can replace the arbitrary constant 200000C by the equally arbitrary
constant C and multiply through by $e^{2t}$ to get

$$P = 100000t + 50000 + Ce^{2t}.$$

This is the equation for the population of bacteria at time t. If the initial
population is $P_0$, then we can set t = 0 in the equation to get

$$P_0 = 100000(0) + 50000 + Ce^{2(0)} = 50000 + C.$$

This means that $C = P_0 - 50000$, so we can insert that into the equation and
get

$$P = 100000t + 50000 + (P_0 - 50000)e^{2t}.$$

Great! So we know a lot about the situation. We still have to answer the
question. For what values of $P_0$ will we ever get down to a population of 0?
It seems that 50000 is a pretty critical number. Indeed, if $P_0 = 50000$, the
above equation is just $P = 100000t + 50000$; in that case, the bacteria start
at 50000 and grow at a constant rate of 100000 per hour, so the population
never dies. If $P_0 > 50000$, then you add a positive multiple of $e^{2t}$ to this and
so the population grows even faster. How about if $P_0 < 50000$? Then the
quantity $P_0 - 50000$ is negative, so we have

$$P = 100000t + 50000 + (\text{negative constant})\,e^{2t}.$$

Since exponentials eventually dominate, it’s a sure thing that if t is large
enough, P will eventually get down to 0. For example, even if the initial
population is 49999, then we have

$$P = 100000t + 50000 - e^{2t}.$$

Here’s the graph of P versus t in this case:

You can see from the graph that the population grows almost linearly for the
first 5 hours, then there is a rapid turnaround, and finally the population
hits 0 some time between 6.5 and 7 hours. (Of course, once it hits 0, that’s
the end of the story: the population never goes below 0, since you can’t
have a negative population! So the above graph doesn’t accurately reflect the
situation when P < 0.) In general, we conclude that if the initial population
is under 50000, the bacteria will die out, whereas if it is 50000 or more, the
culture will survive; in fact, it will always grow in that case.
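Here is a minimal Python sketch of the model, in case you’d like to experiment with different initial populations. The function names `population` and `extinction_time` are my own, and the bisection search is just one convenient way to locate the moment P hits 0; nothing here comes from the original text beyond the formula for P.

```python
import math

def population(t, p0):
    """P(t) = 100000 t + 50000 + (P0 - 50000) e^{2t}, with t in hours."""
    return 100000 * t + 50000 + (p0 - 50000) * math.exp(2 * t)

def extinction_time(p0, tol=1e-9):
    """Time at which P first reaches 0, or None if the culture survives."""
    if p0 >= 50000:
        return None                 # P only grows in this case
    if p0 <= 0:
        return 0.0                  # nothing there to wipe out
    lo, hi = 0.0, 1.0
    while population(hi, p0) > 0:   # march right until P has gone negative
        lo, hi = hi, 2 * hi
    while hi - lo > tol:            # bisect on the sign change
        mid = (lo + hi) / 2
        if population(mid, p0) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

if __name__ == "__main__":
    print(extinction_time(49999))   # a time between 6.5 and 7 hours
    print(extinction_time(50000))   # None: the culture survives
```

Bisection is safe here because, when $P_0 < 50000$, the second derivative of P is negative, so the curve is concave and crosses 0 only once for t > 0.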











APPENDIX A 

Limits and Proofs 


Throughout this book we have used limits extensively, in their own right and 
also as an essential part of the definitions of the derivative and the integral. 
Since limits are so important, it’s about time that we define them properly. 
Once we know how they work, we can prove a number of facts that we’ve been 
taking for granted. So, here’s what’s in this appendix: 

• the formal definition of a limit (including left-hand and right-hand limits, 
infinite limits, limits at ±∞, and limits of sequences);

• combining limits, and a proof of the sandwich principle; 

• the relationship between continuity and limits, including a proof of the 
Intermediate Value Theorem; 

• differentiation and limits, including proofs of the product, quotient, and 
chain rules; 

• a proof of a result concerning piecewise-defined functions and derivatives; 

• a proof of the existence of e; 

• proofs of the Extreme Value Theorem, Rolle’s Theorem, the Mean Value 
Theorem (for derivatives), the formula for the error term in linearization, 
and l’Hôpital’s Rule; and

• a proof of the Taylor approximation theorem. 


A.1 Formal Definition of a Limit

We start with a function f and a real number a. In Section 3.1 of Chapter 3
we introduced the notation

$$\lim_{x \to a} f(x) = L,$$

which is used throughout this book. Intuitively, the above equation means 
that when x is close to a, the values of f(x) get very very close to L. How 
close? As close as you want them to be. To see what this means, let’s play a 
little game, you and I. 















A.1.1 A little game


I could have taken away more and it would still have been fine — as long as 
what’s left is between your lines. 

Now it’s your move again. You have realized that my task is harder when
your lines are closer together, so this time you pick a smaller value of ε. Here’s
the situation after your second move:



Parts of the curve are outside the horizontal lines again, but I haven’t had my 
second move yet. I’m going to throw away more of the function away from 
x = a, like this: 



So once again I was able to make a move to counter your move. 

When does the game stop? Hopefully, the answer is never! If I can always
move, no matter how close together you make the lines, then it will indeed
be true that $\lim_{x \to a} f(x) = L$. We will have zoomed in and in, you pushing your
lines closer together, I responding by focusing only on the part of the function
close enough to x = a. On the other hand, if I ever get stuck for a move, then
it’s not true that $\lim_{x \to a} f(x) = L$. The limit might be something else, or it may
not exist, but it’s definitely not L.

A.1.2 The actual definition

We need to turn the game into some more symbols. First, notice the interval
you choose is $(L - \varepsilon, L + \varepsilon)$. In fact, you can also think of that interval as the
set of points y satisfying $|y - L| < \varepsilon$. Why? Because $|y - L|$ is the distance
between y and L on a number line (such as the y-axis). So your interval
consists of all the points which are less than ε away from L. As you might
guess, it will be incredibly useful for you to be able to convert an inequality
like $|y - L| < \varepsilon$ into its equivalent form $L - \varepsilon < y < L + \varepsilon$ and back again.

Now it’s my move. I always need the function to lie in your interval. This
means that after I’ve thrown away a lot of the domain, all the remaining
values of f(x) must be less than ε away from L. So after my move, you’ll
have to conclude that

    $|f(x) - L| < \varepsilon$ for all $x \ne a$ which are close enough to a.

To be more precise about my move, notice that I am throwing away everything
except in an interval centered at a. My interval looks like $(a - \delta, a + \delta)$ for
some other number δ, so I can also think of it as all the numbers x such
that $|x - a| < \delta$. In fact, since I don’t want x to be equal to a, I can write
$0 < |x - a| < \delta$.

In summary, then, your move consists of picking ε > 0. (It had better
be positive or else there’s no tolerance window at all!) My move consists of
picking a number δ > 0 such that

    $|f(x) - L| < \varepsilon$ for all x satisfying $0 < |x - a| < \delta$.

Remember, this means that whenever x is less than a distance δ away
from a (except for a itself), the value of f(x) is less than a distance of ε
away from L. This quantifies the idea that f(x) is close to L when x is close
to a. Now all that’s left is to allow you to make your choice of ε as small
as you like, and I still have to pick the δ accordingly. So, here’s the formal
definition we’re looking for:

    “$\lim_{x \to a} f(x) = L$” means that for any choice of ε > 0
    you make, I can pick δ > 0 such that:

    $|f(x) - L| < \varepsilon$ for all x satisfying $0 < |x - a| < \delta$.

It’s very important that I get to move after you do! My choice of δ depends
on your choice of ε. Normally I can’t make a universal choice of δ that works
for every ε > 0. I just have to adapt to your choice.
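To make the game concrete, here is a small Python sketch that plays one round numerically. It is only an illustration under stated assumptions: the function name `delta_works` and the example $f(x) = 2x + 1$ near a = 1 (with L = 3) are my own choices, and checking finitely many sample points can expose a bad δ but can never prove that a δ is good. That still takes an argument with inequalities.

```python
def delta_works(f, a, L, eps, delta, samples=10000):
    """Sample x with 0 < |x - a| < delta and test whether |f(x) - L| < eps."""
    for i in range(1, samples + 1):
        offset = delta * i / (samples + 1)   # strictly between 0 and delta
        for x in (a - offset, a + offset):
            if abs(f(x) - L) >= eps:
                return False                 # found an x that escapes the window
    return True

def f(x):
    # Hypothetical example: f(x) = 2x + 1, which should tend to 3 as x -> 1.
    return 2 * x + 1

if __name__ == "__main__":
    print(delta_works(f, a=1, L=3, eps=0.1, delta=0.05))  # my move succeeds
    print(delta_works(f, a=1, L=3, eps=0.1, delta=0.1))   # window too wide
```

For this particular f, $|f(x) - 3| = 2|x - 1|$, so δ = ε/2 is exactly the widest window that works; the second call fails because its δ is twice that.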


A.1.3 Examples of using the definition

As a simple example, let’s show, without using continuity, that

$$\lim_{x \to 3} x^2 = 9.$$

Tempting as it is to write $3^2 = 9$ and declare victory, that doesn’t work
because the limit only depends on what happens when x is near, but not
equal to, 3. So, we have to play our little game. You choose your ε > 0, which
makes a little window $(9 - \varepsilon, 9 + \varepsilon)$ that I have to stay within. Now I get
to pick my δ. Suppose that your ε is 8, which is humongous in this context.
Then your window is (1, 17). Well, I can easily stay in there by choosing my
δ = 1, which means that my window is (2, 4). (Remember, my window is
centered at 3, while yours is centered at 9.) Indeed, if you square any number
between 2 and 4, you get a number between 4 and 16, so my move is fine. If
your ε is even bigger than 8, well, that just widens your interval, but I’ll stick
with my δ = 1 and be just fine.

Now, if you choose your tolerance ε less than 8, I have to change my tactic.
My choice in this case will be ... drumroll ... δ = ε/8. That is, I’m making
my window eight times smaller than yours, no matter how wide you choose
it. To see that this works, we have to be clever. Basically, we have to take
any number in my interval, square it, and show that it lies in your interval.
My interval is $(3 - \varepsilon/8, 3 + \varepsilon/8)$ and yours is $(9 - \varepsilon, 9 + \varepsilon)$.

So let’s pick x in my interval. How big could it be? It’s got to be less
than $3 + \varepsilon/8$. That is, $x < 3 + \varepsilon/8$, which you can also write as $x - 3 < \varepsilon/8$.
By the way, since your ε is less than 8, my x is less than 4. So, using both
inequalities $x - 3 < \varepsilon/8$ and $x < 4$, we get

$$(x - 3)(x + 3) < \left(\frac{\varepsilon}{8}\right)(4 + 3) = \frac{7\varepsilon}{8}.$$

(Here x + 3 is positive because x is within 1 of 3; if x − 3 happens to be zero
or negative, the left-hand side is at most 0, so the inequality is certainly true.)
Since $(x - 3)(x + 3)$ is just $x^2 - 9$, we can add 9 to both sides of the inequality
and see that

$$x^2 < 9 + \frac{7\varepsilon}{8}.$$

So we’re OK on the upper tolerance level (the upper of your two lines). We
needed $x^2 < 9 + \varepsilon$, and we have done that. How about the lower one? Well,
how small could my x be, given that it lies in my interval $(3 - \varepsilon/8, 3 + \varepsilon/8)$?
It’s got to be bigger than $3 - \varepsilon/8$, so we have $x > 3 - \varepsilon/8$. This means that
$x - 3 > -\varepsilon/8$. Since your ε is less than 8, we also have $x - 3 > -8/8 = -1$,
which means that x > 2, so x + 3 is positive. Now we have to be a little careful
with signs. If $x \ge 3$, then $(x - 3)(x + 3) \ge 0$, which is certainly bigger than
$-3\varepsilon/4$. If instead x < 3, then x − 3 is negative and x + 3 is a positive number
less than 6, so

$$(x - 3)(x + 3) > 6(x - 3) > 6\left(-\frac{\varepsilon}{8}\right) = -\frac{3\varepsilon}{4}.$$

Once again, $(x - 3)(x + 3) = x^2 - 9$, so we add 9 to both sides and get

$$x^2 > 9 - \frac{3\varepsilon}{4}.$$

This takes care of the lower tolerance level! We have shown that if x lies in
the interval $(3 - \varepsilon/8, 3 + \varepsilon/8)$, then $x^2$ is in the interval $(9 - 3\varepsilon/4, 9 + 7\varepsilon/8)$.
Since both 3/4 and 7/8 are less than 1, we can also confidently say that $x^2$ is
in the interval $(9 - \varepsilon, 9 + \varepsilon)$; after all, this interval contains the other one.
Tying it all together, let’s set $f(x) = x^2$, and we’ll justify the equation

$$\lim_{x \to 3} f(x) = 9.$$

You choose ε, and I respond by picking δ = ε/8 unless your ε is 8 or more,
in which case I just pick δ = 1. We have shown that in either case, if x is in
the interval $(3 - \delta, 3 + \delta)$, then f(x) is in the interval $(9 - \varepsilon, 9 + \varepsilon)$. In other
words, whenever $|x - 3| < \delta$, then $|f(x) - 9| < \varepsilon$. We can also exclude x = 3
if we like and say that if $0 < |x - 3| < \delta$, then $|f(x) - 9| < \varepsilon$. This is exactly
what we need: we have justified our equation. Believe it or not, that’s pretty
much what you have to do if you want to prove that the above limit is true
by using the definition!
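If you’d like to watch this strategy in action, here is a short Python sketch. The function names are my own, and sampling points is only a spot check, not a proof. It encodes my move (δ = 1 when ε is 8 or more, otherwise δ = ε/8) and tests that squares of sampled x near 3 stay within ε of 9.

```python
def choose_delta(eps):
    """My move in the text: delta = 1 if eps >= 8, otherwise eps / 8."""
    return 1.0 if eps >= 8 else eps / 8

def stays_in_window(eps, samples=10000):
    """Sample x with 0 < |x - 3| < delta and test |x^2 - 9| < eps."""
    delta = choose_delta(eps)
    for i in range(1, samples + 1):
        offset = delta * i / (samples + 1)   # strictly between 0 and delta
        for x in (3 - offset, 3 + offset):
            if abs(x * x - 9) >= eps:
                return False
    return True

if __name__ == "__main__":
    for eps in (20, 8, 4, 1, 0.01, 1e-6):
        print(eps, choose_delta(eps), stays_in_window(eps))
```

Every line should report `True`, whether your ε is humongous or tiny.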

A.2 Making New Limits from Old Ones

That last example was pretty annoying. Just to show that $x^2 \to 9$ as $x \to 3$,
we had to do a lot of work. Luckily it turns out that once you know a couple
of limits, you can put them together and get a whole bunch of new ones. For
example, you can add, subtract, multiply, and divide limits within reason,
and there’s also the sandwich principle. Let’s see why all this is true.

A.2.1 Sums and differences of limits—proofs

Suppose that we have two functions f and g, and we know that as $x \to a$, we
have $f(x) \to L$ and $g(x) \to M$. What should happen to $f(x) + g(x)$ as $x \to a$?
Intuitively, it should tend to L + M. Let’s prove this using the definition. So,
we know that

$$\lim_{x \to a} f(x) = L \quad\text{and}\quad \lim_{x \to a} g(x) = M.$$

This means that if you pick ε > 0, I can ensure that $|f(x) - L| < \varepsilon$ by
restricting x close enough to a. I can also ensure that $|g(x) - M| < \varepsilon$ if x
is close enough to a. The degrees of closeness that I need might be different
for f and g, but it doesn’t matter: I can just go close enough so that both
inequalities work.

Now, if $f(x) + g(x)$ is close to L + M, this means that the difference between
these things should be small. So we’ll need to worry about the quantity
$|(f(x) + g(x)) - (L + M)|$. We’ll write this as $|(f(x) - L) + (g(x) - M)|$. We
can then use the so-called triangle inequality, which says* that $|a + b| \le |a| + |b|$
for any numbers a and b, as follows:

$$|(f(x) - L) + (g(x) - M)| \le |f(x) - L| + |g(x) - M| < \varepsilon + \varepsilon = 2\varepsilon,$$

provided that x is close enough to a. This is almost good enough, except
that you wanted a tolerance of ε, not 2ε! So I have to make my move again
(sorry about that); this time I’ll narrow my focus so that both $|f(x) - L|$ and
$|g(x) - M|$ are less than ε/2 instead of ε. This is no problem, since I can
deal with any positive number that you pick. Anyway, if you redo the above
inequality, you’ll get ε on the right instead of 2ε, so we have proven that I
can find a little window about x = a such that

$$|(f(x) + g(x)) - (L + M)| < \varepsilon$$

whenever x is in my window. (You can use δ if you like to describe the
window better, but that doesn’t really get us anything extra.) So this proves
the following:

    if $\lim_{x \to a} f(x) = L$ and $\lim_{x \to a} g(x) = M$, then $\lim_{x \to a} (f(x) + g(x)) = L + M$.

That is, the limit of the sum is the sum of the limits. Another way of writing
this is

$$\lim_{x \to a} (f(x) + g(x)) = \lim_{x \to a} f(x) + \lim_{x \to a} g(x),$$

but here you have to be careful to check that both limits on the right exist
and are finite. If either limit is ±∞ or doesn’t exist, the deal’s off. Both
limits have to be finite to guarantee that you can add them up. You might
get lucky if they’re not, but there’s no guarantee.

How about $f(x) - g(x)$? That should go to L − M, and it does:

    if $\lim_{x \to a} f(x) = L$ and $\lim_{x \to a} g(x) = M$, then $\lim_{x \to a} (f(x) - g(x)) = L - M$.

The proof is almost identical to the one we just looked at, except that you
need a slightly different form of the triangle inequality: $|a - b| \le |a| + |b|$.
Actually, this is just the triangle inequality applied to a and −b; indeed,
$|a + (-b)| \le |a| + |-b|$, but of course $|-b|$ is equal to $|b|$. I leave it to you to
rewrite the above argument but change the plus signs between f(x) and g(x),
and between L and M, into minus signs.

* Since we’re proving stuff, here’s a proof of the triangle inequality. We start off with
the observation that $x \le |x|$ for any number x. Indeed, if x is positive or 0, then $x = |x|$;
otherwise, the left-hand side is negative and the right-hand side is positive. Replace x by
ab to get $ab \le |ab| = |a| \cdot |b|$. Now multiply this by 2 and add $a^2 + b^2$ to both sides. We
get $a^2 + b^2 + 2ab \le a^2 + b^2 + 2|a| \cdot |b|$. The left-hand side is just $(a + b)^2$. Since $x^2 = |x|^2$
for any x, we can replace the left-hand side by $|a + b|^2$. Similarly, on the right we have
$|a|^2 + |b|^2 + 2|a| \cdot |b|$, or just $(|a| + |b|)^2$. Altogether then, our inequality is $|a + b|^2 \le (|a| + |b|)^2$.
Now we just have to take square roots and we’re done, since $|a + b|$ and $|a| + |b|$ are both
nonnegative.


A.2.2 Products of limits—proof

Now we once again assume that we have two functions f and g such that

$$\lim_{x \to a} f(x) = L \quad\text{and}\quad \lim_{x \to a} g(x) = M.$$

We want to show that

$$\lim_{x \to a} f(x)g(x) = LM.$$

That is, the limit of the product is the product of the limits. Another way of
writing this is

$$\lim_{x \to a} f(x)g(x) = \lim_{x \to a} f(x) \times \lim_{x \to a} g(x),$$

again with the understanding that both limits on the right-hand side are
already known to exist and be finite. To prove this, we need to show that
the difference between f(x)g(x) and the (hopeful) limit LM is small. Let’s
consider that difference $f(x)g(x) - LM$. The trick is to subtract Lg(x) and
add it back on again! That is,

$$f(x)g(x) - LM = f(x)g(x) - Lg(x) + Lg(x) - LM.$$

What does that get us? Let’s take absolute values, then use the triangle
inequality:

$$|f(x)g(x) - LM| = |(f(x) - L)g(x) + L(g(x) - M)| \le |(f(x) - L)g(x)| + |L(g(x) - M)|.$$

We can tidy this up a little and write

$$|f(x)g(x) - LM| \le |f(x) - L| \cdot |g(x)| + |L| \cdot |g(x) - M|.$$

Now it’s time to play the game. You pick your positive number ε and then
I get to work. I concentrate on an interval around x = a so small that
$|f(x) - L| < \varepsilon$ and $|g(x) - M| < \varepsilon$. In fact, if you pick ε > 1 (a pretty
feeble move, if you ask me; you want ε to be small!) then I’m even going
to insist that $|g(x) - M| < 1$ in that case. So we know in either case that
$|g(x) - M| < 1$, which means that $M - 1 < g(x) < M + 1$ on my interval. In
particular, we can see that $|g(x)| < |M| + 1$. The whole point is that we have
some nice inequalities on my interval:

$$|f(x) - L| < \varepsilon, \qquad |g(x)| < |M| + 1, \qquad\text{and}\qquad |g(x) - M| < \varepsilon.$$

We can insert these into the inequality for $|f(x)g(x) - LM|$ above:

$$|f(x)g(x) - LM| \le |f(x) - L| \cdot |g(x)| + |L| \cdot |g(x) - M| < \varepsilon(|M| + 1) + |L| \cdot \varepsilon = \varepsilon(|M| + |L| + 1)$$

for x close enough to a. That’s almost what I want! I was supposed to get
ε on the right-hand side, but I got an extra factor of $(|M| + |L| + 1)$. This
is no problem: you just have to allow me to make my move again, but this
time I’ll make sure that $|f(x) - L|$ is less than $\varepsilon/(|M| + |L| + 1)$ and
similarly for $|g(x) - M|$. Then when I replay all the steps, ε will be replaced
by $\varepsilon/(|M| + |L| + 1)$, and at the very last step, the factor $(|M| + |L| + 1)$ will
cancel out and we’ll just get our ε! So we have proved the result.

By the way, it’s worth noting a special case of the above. If c is constant,
then

$$\lim_{x \to a} cf(x) = c \lim_{x \to a} f(x).$$

This is easy to see by setting g(x) = c in our main formula above; I leave the
details to you.


A.2.3 Quotients of limits—proof

Now we repeat our exercise. We want to show that if

$$\lim_{x \to a} f(x) = L \quad\text{and}\quad \lim_{x \to a} g(x) = M,$$

then we have

$$\lim_{x \to a} \frac{f(x)}{g(x)} = \frac{L}{M}.$$

So the limit of the quotient is the quotient of the limits. For this to work,
we’d better have $M \ne 0$ or else we’ll be dividing by 0. Another way of writing
the above equation is

$$\lim_{x \to a} \frac{f(x)}{g(x)} = \frac{\lim_{x \to a} f(x)}{\lim_{x \to a} g(x)},$$

provided that both limits exist and are finite, and that the g-limit is nonzero.

Here’s how the proof goes. We want f(x)/g(x) to be close to L/M, so
we consider the difference. Then we’ll need to take a common denominator,
leaving us with

$$\frac{f(x)}{g(x)} - \frac{L}{M} = \frac{Mf(x) - Lg(x)}{Mg(x)}.$$

Now we do a trick similar to the one we used for products of limits: we’ll
subtract and add LM to the numerator, then factor. This gives us

$$\frac{f(x)}{g(x)} - \frac{L}{M} = \frac{Mf(x) - LM + LM - Lg(x)}{Mg(x)}
  = \frac{M(f(x) - L)}{Mg(x)} + \frac{L(M - g(x))}{Mg(x)}
  = \frac{f(x) - L}{g(x)} - \frac{L(g(x) - M)}{Mg(x)}.$$

If we take absolute values and then use the triangle inequality in the form
$|a - b| \le |a| + |b|$, we get

$$\left|\frac{f(x)}{g(x)} - \frac{L}{M}\right| = \left|\frac{f(x) - L}{g(x)} - \frac{L(g(x) - M)}{Mg(x)}\right|
  \le \left|\frac{f(x) - L}{g(x)}\right| + \left|\frac{L(g(x) - M)}{Mg(x)}\right|.$$

So you make your move by picking ε > 0, and then I narrow the window
of interest around x = a so that $|f(x) - L| < \varepsilon$ and $|g(x) - M| < \varepsilon$ in the
little window. Now I need to be even trickier, though. You see, I know that
$M - \varepsilon < g(x) < M + \varepsilon$, which means that $|g(x)| > |M| - \varepsilon$. All’s well if
this right-hand quantity $|M| - \varepsilon$ is positive, but if it’s negative, it tells us
nothing since we already knew that $|g(x)|$ can’t be negative. So if your ε is
small enough, then I don’t worry, but if it’s a little bigger, I need to narrow
my window more so that $|g(x)| > |M|/2$ on the window. So altogether we
have three inequalities which are true on the little interval:

$$|f(x) - L| < \varepsilon, \qquad |g(x)| > \frac{|M|}{2}, \qquad\text{and}\qquad |g(x) - M| < \varepsilon.$$

This middle inequality can be inverted to read

$$\frac{1}{|g(x)|} < \frac{2}{|M|}.$$

Putting everything together, we have

$$\left|\frac{f(x)}{g(x)} - \frac{L}{M}\right| \le \frac{|f(x) - L|}{|g(x)|} + \frac{|L| \cdot |g(x) - M|}{|M| \cdot |g(x)|}
  < \varepsilon \cdot \frac{2}{|M|} + |L| \cdot \varepsilon \cdot \frac{1}{|M|} \cdot \frac{2}{|M|}
  = \varepsilon\left(\frac{2}{|M|} + \frac{2|L|}{|M|^2}\right).$$

Not quite what we wanted: we have an extra factor of $(2/|M| + 2|L|/|M|^2)$,
but we know how to handle this. I just make my move again, but instead of
your ε, I use ε divided by this extra factor.

A.2.4 The sandwich principle—proof

In Section 3.6, we looked at the sandwich principle. Now it’s time to prove
it. We start with functions f, g, and h, such that $g(x) \le f(x) \le h(x)$ for all
x close enough to a. We also know that

$$\lim_{x \to a} g(x) = L \quad\text{and}\quad \lim_{x \to a} h(x) = L.$$

Intuitively, f is squished between g and h more and more, so that in the limit
as $x \to a$, we should have $f(x) \to L$ as well. That is, we need to prove that

$$\lim_{x \to a} f(x) = L.$$

Well, you start off by picking your positive number ε, and then I can focus on
an interval centered at a small enough so that $|g(x) - L| < \varepsilon$ and $|h(x) - L| < \varepsilon$
on the interval. I’m also going to need the inequality $g(x) \le f(x) \le h(x)$ to
be true on the interval; since that inequality might only be true when x is
very near to a, I may have to shrink my original interval.

Anyway, we know that $|h(x) - L| < \varepsilon$ when x is close enough to a; the
inequality can be rewritten as

$$L - \varepsilon < h(x) < L + \varepsilon.$$

Actually, we only need the right-hand inequality, $h(x) < L + \varepsilon$; you see, on
my little interval, we know that $f(x) \le h(x)$, so we also have

$$f(x) \le h(x) < L + \varepsilon.$$

Similarly, we know that

$$L - \varepsilon < g(x) < L + \varepsilon$$

when x is close enough to a; this time we throw away the right-hand inequality
and use $g(x) \le f(x)$ to get

$$L - \varepsilon < g(x) \le f(x).$$

Putting all this together, we have shown that when x is close to a, we have

$$L - \varepsilon < f(x) < L + \varepsilon,$$

or simply $|f(x) - L| < \varepsilon$. That’s what we need to show our limit, so we’ve
proved the sandwich principle!

A.3 Other Varieties of Limits

Now let’s quickly look at the definitions of some other types of limits: infinite
limits, left-hand and right-hand limits, and limits at ±∞.

A.3.1 Infinite limits

Our game isn’t going to work if we want to use it to define a limit like this:

$$\lim_{x \to a} f(x) = \infty.$$

When you try to draw your two lines close to the limit, you’ll be completely
stuck, since the limit is supposed to be ∞ instead of some finite value L. So we
have to modify the rules a little bit. My move won’t change much, but yours
will. Instead of picking a little number ε and then drawing two horizontal lines
(at height L − ε and L + ε), this time you’ll pick a large number M and only
draw in the line at height M. I still make my move by throwing away most
of the function, except for a small bit around x = a; this time, though, I have
to make sure that what’s left is always above your line. For example, the
following pictures show a move you might make and then a possible response
for me:

[Figure: your move, a horizontal line at height M; my move, the piece of the curve near x = a that lies above that line.]

Now here’s what happens if you make another move but with a larger value
of M:

[Figure: your move with a larger M; my move, a narrower piece of the curve near x = a.]

So the idea is that this time you raise your bar higher and higher; if I can
always make a move in response, then the limit is indeed ∞. In symbols, I
need to be able to ensure that f(x) > M whenever x is close enough to a, no
matter how big M is. The definition looks like this:

    “$\lim_{x \to a} f(x) = \infty$” means that for any choice of M > 0
    you make, I can pick δ > 0 such that:

    $f(x) > M$ for all x satisfying $0 < |x - a| < \delta$.

It’s very similar to the situation when the limit is some finite number L, except
that the inequality $|f(x) - L| < \varepsilon$ is replaced by $f(x) > M$.

For example, suppose that we want to show that

$$\lim_{x \to 0} \frac{1}{x^2} = \infty.$$

You start off by picking your number M; then I have to make sure that
f(x) > M when x is close enough to 0. Well, suppose that I throw everything
away except for x satisfying $|x| < 1/\sqrt{M}$. For such an x, we have $x^2 < 1/M$,
so $1/x^2 > M$ (note that we have assumed that $x \ne 0$). That means that
f(x) > M in my interval, which means my move is valid. So for any M you
pick, I can make a valid move, and we have proved that the limit is indeed
∞.
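Here is a short Python sketch of this version of the game for $f(x) = 1/x^2$. The function names are hypothetical and, once more, sampling is a spot check rather than a proof. My move is $\delta = 1/\sqrt{M}$, and the code tests that sampled values of $1/x^2$ with $0 < |x| < \delta$ all clear your bar at height M.

```python
import math

def choose_delta(M):
    """My move in the text: throw away everything except |x| < 1/sqrt(M)."""
    return 1 / math.sqrt(M)

def stays_above(M, samples=10000):
    """Sample x with 0 < |x| < delta and test whether 1/x^2 > M."""
    delta = choose_delta(M)
    for i in range(1, samples + 1):
        offset = delta * i / (samples + 1)   # strictly between 0 and delta
        for x in (-offset, offset):
            if 1 / (x * x) <= M:
                return False
    return True

if __name__ == "__main__":
    for M in (0.5, 1, 100, 1e6, 1e12):
        print(M, choose_delta(M), stays_above(M))
```

Raising M only shrinks δ; it never leaves me without a move.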

How about −∞? Everything is just reversed. You still pick a large positive
number M, but this time I need to make my move so that the function is
always below the horizontal line of height −M. So here’s what the definition
looks like:

    “$\lim_{x \to a} f(x) = -\infty$” means that for any choice of M > 0
    you make, I can pick δ > 0 such that:

    $f(x) < -M$ for all x satisfying $0 < |x - a| < \delta$.


A.3.2 Left-hand and right-hand limits

To define a right-hand limit, we play the same game, except this time before
we start, we already throw away everything to the left of x = a. The effect is
that instead of choosing an interval like $(a - \delta, a + \delta)$ when I make my move,
now I just have to worry about $(a, a + \delta)$. Nothing to the left of a is relevant.

Similarly, for a left-hand limit, only the values of x to the left of a matter.
This means that my intervals look like $(a - \delta, a)$; I have thrown away everything
to the right of x = a.

This all means that you can take any of the above definitions in boxes and
change the inequality $0 < |x - a| < \delta$ to $0 < x - a < \delta$ to get the right-hand
limit. To get the left-hand limit, you change the inequality to $0 < a - x < \delta$
instead. I’ll spare you the gory details of writing out all six versions (that’s
each of the limits with values L, ∞, and −∞ in both left-hand and right-hand
versions) but it’s not a bad exercise for you to try to do it without looking at
these pages.

A.3.3 Limits at $\infty$ and $-\infty$


Our final variety of limit occurs when the limit is taken at $\infty$ or
$-\infty$ instead of at some finite value a. So we want to define what the
following equation means:

$$\lim_{x \to \infty} f(x) = L.$$

The game has to change a little, of course, but we already know how. In
fact we just have to adapt the methods from Section A.3.1 above. You’ll
start by picking your little number $\varepsilon > 0$, establishing your
tolerance interval $(L - \varepsilon, L + \varepsilon)$; then my move will be to
throw away the function to the left of some vertical line x = N, so that all the
function values to the right of the line lie in your tolerance interval. Then
you pick a smaller $\varepsilon$, and I move the line rightward if I have to in
order to lie within your new, smaller interval. Here’s what the first couple of
moves for both of us might look like:




After your first move, my move ensures that all the function values to the 
right of the line x = N lie in your tolerance interval. You respond by closing 
in the interval, but then I just move the line to the right until I can meet your 
new, more restrictive tolerance interval. Again, if I can always make a move 
in response to you, then the above limit is true. 

More formally, my move consists of picking N such that f(x) is in the
interval $(L - \varepsilon, L + \varepsilon)$ whenever $x > N$ (so x is to the
right of the vertical line x = N). Using absolute values, we can write this as
follows:

“$\lim_{x \to \infty} f(x) = L$” means that for any choice of $\varepsilon > 0$
you make, I can pick N such that:

$|f(x) - L| < \varepsilon$ for all x satisfying $x > N$.


It’s worth noting that any limit as $x \to \infty$ is necessarily a left-hand
limit, since there’s nothing to the right of $\infty$! Anyway, there are still a
couple of variations to look at. First, what does
$\lim_{x \to \infty} f(x) = \infty$ mean? You just have to adapt the previous
definitions. In particular, you can take the above definition and change your
move to picking $M > 0$, and now instead of requiring that
$|f(x) - L| < \varepsilon$, this changes to $f(x) > M$. If instead you would
like to show that $\lim_{x \to \infty} f(x) = -\infty$, you would change the
inequality to $f(x) < -M$. Pretty straightforward.


It’s also pretty easy to define what

$$\lim_{x \to -\infty} f(x) = L, \quad \lim_{x \to -\infty} f(x) = \infty, \quad \text{and} \quad \lim_{x \to -\infty} f(x) = -\infty$$

mean. The only thing that changes from the respective case where
$x \to \infty$ is that my vertical line will be at $x = -N$, and now the function
values have to lie in your tolerance region to the left of the line instead of to
the right. That is, you just change the inequality $x > N$ to $x < -N$ in all
the definitions.

We can actually use the same idea to define the limit of an infinite
sequence. In Section 22.1 of Chapter 22, we gave an informal definition, but
now we can do better. Start off with an infinite sequence
$a_1, a_2, a_3, \ldots$; then

“$\lim_{n \to \infty} a_n = L$” means that for any choice of $\varepsilon > 0$
you make, I can pick N such that:

$|a_n - L| < \varepsilon$ for all n satisfying $n > N$.


If you compare this definition with that of

$$\lim_{x \to \infty} f(x) = L$$

above, you’ll see that they are almost the same. The only difference is that
the continuous variable x has been replaced by the integer-valued variable n.
In the case that L is replaced by $\infty$ (or $-\infty$), then you choose
$M > 0$ instead of $\varepsilon > 0$, and the inequality
$|a_n - L| < \varepsilon$ changes to $a_n > M$ (or $a_n < -M$, respectively).
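To make the sequence version concrete, here is a small sketch in Python for
one sequence chosen purely as an illustration, $a_n = 1/n$ with L = 0 (the
example and the names `pick_N` and `move_is_valid` are assumptions, not
something from the text). If you pick $\varepsilon$, I answer with any
$N \geq 1/\varepsilon$, because $n > N$ then forces $1/n < \varepsilon$.

```python
import math


def a(n):
    """Illustrative sequence a_n = 1/n, whose limit is L = 0."""
    return 1.0 / n


def pick_N(eps):
    """My move: any N >= 1/eps works, so take the ceiling of 1/eps."""
    return math.ceil(1.0 / eps)


def move_is_valid(eps, how_many=1000):
    """Check |a_n - 0| < eps for the first `how_many` integers n > N."""
    N = pick_N(eps)
    return all(abs(a(n) - 0.0) < eps for n in range(N + 1, N + 1 + how_many))


for eps in (0.5, 0.1, 0.001):
    print(eps, pick_N(eps), move_is_valid(eps))
```

The check only looks at finitely many n, so it illustrates the definition
rather than proving anything.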

Now if you really want a challenge, try writing out the definition of every 
possible type of limit (there are 18 that we’ve looked at!), and for an encore, 
see if you can prove analogues of all the results in Section A.2 above for the 
other cases. 

A.3.4 Two examples involving trig 

In Section 3.4 of Chapter 3, we claimed the following limit does not exist
(DNE):

$$\lim_{x \to \infty} \sin(x).$$

The intuition is that sin(x) keeps oscillating between $-1$ and 1, so it doesn’t
tend to any one number. Let’s use the definition from Section A.3.3 above to
prove that the intuition is correct. Suppose that the limit does exist and that
it has the value L. You pick your number $\varepsilon > 0$, and then I need to
pick a large number N such that $|\sin(x) - L| < \varepsilon$ whenever
$x > N$. So let’s suppose you pick your $\varepsilon$ to be $\frac{1}{2}$. This
means that I need to ensure that $|\sin(x) - L| < \frac{1}{2}$ whenever
$x > N$. Another way of looking at this is that sin(x) has to lie in the
interval $(L - \frac{1}{2}, L + \frac{1}{2})$ for all $x > N$. Unfortunately,
this can’t happen, no matter what L and N are! To see why, just pick the first
multiple of $\pi$ bigger than N; let’s say that this number is $n\pi$ for some
integer n. Then $\sin(n\pi + \pi/2)$ and $\sin(n\pi + 3\pi/2)$ are 1 and $-1$
in some order (which one is 1 depends on whether n is even or odd). These two
values of sin(x) are distance 2 apart, so they can’t both lie in the interval
$(L - \frac{1}{2}, L + \frac{1}{2})$ since that interval is only 1 unit long. So
the limit can’t be L for any finite number L.

Here’s a picture of what’s going on for three potential candidates for our 
hopeful limit L: 




The interval around L extends $\frac{1}{2}$ on either side in each case, but in
each of the three cases, I can’t cram sin(x) into the interval even if I throw
a lot of it away. There’s no vertical line I can draw and state that to the
right of that line, I am always in your interval, since sin(x) keeps going out
of the interval. The same is true no matter what horizontal stripe of height 1
we look at.

To be completely diligent, we should also make sure that the limit can’t
be $\infty$ or $-\infty$. In fact, if the limit were $\infty$, then you’d pick
$M > 0$ and I’d have to make sure that $\sin(x) > M$ whenever $x > N$ for some
N. All you have to do to thwart me, though, is to pick M = 2. Then I’m screwed,
since $\sin(x) > 2$ is never true for any x! The same move works for $-\infty$
(try it and see). So we have indeed shown that the above limit doesn’t exist.
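Here is a minimal Python sketch of the trick used for a finite L: whatever N
I choose, you can produce two points beyond it where sin takes values about 2
apart. The function names `witnesses` and `gap` are hypothetical, and the
floating-point values are only approximately $\pm 1$.

```python
import math


def witnesses(N):
    """Return two points to the right of N where sin is about +1 and -1."""
    n = math.floor(N / math.pi) + 1      # n*pi is a multiple of pi beyond N
    x1 = n * math.pi + math.pi / 2
    x2 = n * math.pi + 3 * math.pi / 2
    return x1, x2


def gap(N):
    """Distance between the two sine values at the witness points."""
    x1, x2 = witnesses(N)
    return abs(math.sin(x1) - math.sin(x2))


for N in (0, 10, 1000, 10**6):
    x1, x2 = witnesses(N)
    print(N, math.sin(x1), math.sin(x2), gap(N))
```

Since the gap stays at about 2 however large N is, no interval of length 1 can
hold both values, which is the whole argument in miniature.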

We also claimed, this time in Section 3.3, that

$$\lim_{x \to 0^+} \sin\left(\frac{1}{x}\right)$$

does not exist. To show that this is true, you can pick a potential limit L and
argue as we did in the previous example. If your move is to pick
$\varepsilon = \frac{1}{2}$, then I need to try to pick $\delta > 0$ so that
$|\sin(1/x) - L| < \frac{1}{2}$ whenever $0 < x < \delta$. (Here we are using
the definition from Section A.3.2 above.) You can now be clever and try to find
two tiny values of x that cause this to screw up. Indeed, if you try
$x = 1/(n\pi + \pi/2)$ and then $x = 1/(n\pi + 3\pi/2)$ for large enough n, you
will be within $0 < x < \delta$, but sin(1/x) will turn out to be 1 and $-1$
(in some order, depending on whether n is even or odd); this is a problem,
since both of them can’t lie in the tolerance interval
$(L - \frac{1}{2}, L + \frac{1}{2})$ regardless of what L is.

You should try writing out these details; but there is a simpler way. You
see, since we already know that $\lim_{x \to \infty} \sin(x)$ doesn’t exist, we
can just do a simple substitution of the limiting variable. Indeed, if you let
$u = 1/x$, then $x = 1/u$, and we immediately know that

$$\lim_{1/u \to \infty} \sin\left(\frac{1}{u}\right)$$

does not exist. Now, when is it true that $1/u \to \infty$? The only way this
can happen is if $u \to 0^+$. It’s not hard to justify this switcheroo in
general (see Section A.4.1 below), so we see that

$$\lim_{u \to 0^+} \sin\left(\frac{1}{u}\right) \quad \text{DNE.}$$

Now just change the dummy variable u to x and we get what we want without
any mess!

A.4 Continuity and Limits

As we saw in Section 5.1.1, to say that a function f is continuous at x = a
means that

$$\lim_{x \to a} f(x) = f(a).$$

That is, when $x \to a$, we have $f(x) \to f(a)$. So the function f preserves
limits; this is the essence of continuity. Anyway, we can now use our knowledge
of limits to justify that when you add, subtract, multiply, or divide two
functions which are both continuous at x = a, then the new function is also
continuous there. (In the case of division, the denominator can’t be 0 at
x = a.) Indeed, suppose that f and g are both continuous at x = a. Then we know
that

$$\lim_{x \to a} f(x) = f(a) \quad \text{and} \quad \lim_{x \to a} g(x) = g(a).$$

So to show that the function f + g is continuous at x = a, all we have to do
is split up the limit, which was justified in Section A.2.1 above:

$$\lim_{x \to a} (f(x) + g(x)) = \lim_{x \to a} f(x) + \lim_{x \to a} g(x) = f(a) + g(a).$$

That’s all there is to it. Now you can replace the + signs by $-$, $\times$, or
$/$ signs to get the similar results for subtraction, multiplication, and
division.

A.4.1 Composition of continuous functions

Let’s look at something a little trickier. Suppose that f and g are both
continuous everywhere; we want to show that the composition $f \circ g$ is
continuous too. We need to focus on one particular value of x to make this
work. So let’s suppose that g is continuous at x = a. Where do we need f to be
continuous? We want to show that

$$\lim_{x \to a} f(g(x)) = f(g(a)).$$

So it’s pointless to worry about whether f is continuous at x = a; we need it
to be continuous at g(a) instead, since we are evaluating f near and at the
point g(a).

So, here’s the situation: we know that g is continuous at x = a, and that
f is continuous at g(a), and we want to show that $f \circ g$ is continuous
at x = a. To do this, we need to add a third player to our game. I will
actually play against this new player, who is called Smiddy, and Smiddy will
play against you.

Here’s how it works. Since f is continuous at g(a), we know that

$$\lim_{y \to g(a)} f(y) = f(g(a)).$$

Note that I used y as a dummy variable instead of x, but that’s fine; you
could change the y to any letter you please and it means the same thing.
Anyway, let’s set L = f(g(a)). Then you pick your $\varepsilon > 0$,
establishing your tolerance interval $(L - \varepsilon, L + \varepsilon)$, and
you challenge Smiddy to throw away everything outside a little interval
centered at y = g(a) in such a way that all the remaining function values lie
in your interval. That is, Smiddy should pick $\lambda > 0$ so that
$|f(y) - L| < \varepsilon$ whenever $|y - g(a)| < \lambda$. Because the above
limit is true, Smiddy can do this. Why $\lambda$ instead of $\delta$? Because
Smiddy’s cool like that.

Now it’s my turn to play against Smiddy. This time, we use the fact that
g is continuous at x = a to write

$$\lim_{x \to a} g(x) = g(a).$$

Here’s the key: instead of $\varepsilon$, which you already used, Smiddy just
uses the number $\lambda$! So Smiddy’s tolerance interval is
$(g(a) - \lambda, g(a) + \lambda)$. Now I have to throw away everything outside
a little interval centered at x = a so that the remaining function values lie
in Smiddy’s interval. Because the above limit is true, I can choose
$\delta > 0$ such that whenever $|x - a| < \delta$, we have
$|g(x) - g(a)| < \lambda$.

All we have to do is put everything together. Because of my game with
Smiddy, we know that whenever $|x - a| < \delta$, we also have
$|g(x) - g(a)| < \lambda$. Now your game with Smiddy shows that if
$|y - g(a)| < \lambda$, then $|f(y) - L| < \varepsilon$. Pushing Smiddy to one
side and replacing L by f(g(a)) and y by g(x), we see that whenever
$|x - a| < \delta$, we have $|f(g(x)) - f(g(a))| < \varepsilon$. This means that
if I play against you directly, I can always make a legitimate move, no matter
what $\varepsilon$ is (as long as it’s positive). So we have indeed shown that

$$\lim_{x \to a} f(g(x)) = f(g(a)),$$

provided that g is continuous at a and f is continuous at g(a). Of course, if
f and g are continuous everywhere, then so is the composition function
$f \circ g$.
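The three-player game can be played out on a specific pair of functions. The
sketch below is only an illustration under assumptions of my own choosing:
$g(x) = 2x$ at a = 1 and $f(y) = y^2$, with Smiddy’s move
$\lambda = \min(1, \varepsilon/5)$ and my move $\delta = \lambda/2$. (For
$|y - 2| < 1$ we have $|y^2 - 4| = |y - 2|\,|y + 2| < 5|y - 2|$, which is where
the 5 comes from.) All function names are hypothetical.

```python
A = 1.0  # the point a at which we test continuity of f(g(x))


def g(x):
    """Inner function for the illustration, g(x) = 2x."""
    return 2.0 * x


def f(y):
    """Outer function for the illustration, f(y) = y^2."""
    return y * y


def smiddy_lambda(eps):
    """Smiddy's reply to your eps: |y - g(a)| < lambda forces |f(y) - L| < eps."""
    return min(1.0, eps / 5.0)


def my_delta(lam):
    """My reply to Smiddy's lambda: |x - a| < delta forces |g(x) - g(a)| < lambda."""
    return lam / 2.0


def composed_move_works(eps, samples=1000):
    """Chain the two moves and spot-check |f(g(x)) - f(g(a))| < eps."""
    lam = smiddy_lambda(eps)
    delta = my_delta(lam)
    target = f(g(A))
    for k in range(-samples, samples + 1):
        x = A + delta * k / (samples + 1)     # points with |x - a| < delta
        if abs(g(x) - g(A)) >= lam:           # my game against Smiddy
            return False
        if abs(f(g(x)) - target) >= eps:      # the direct game against you
            return False
    return True


for eps in (10.0, 1.0, 0.1, 1e-6):
    print(eps, smiddy_lambda(eps), my_delta(smiddy_lambda(eps)), composed_move_works(eps))
```

Notice that my $\delta$ never looks at $\varepsilon$ directly; it only responds
to Smiddy’s $\lambda$, just as in the argument above.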

The argument can be modified to include the cases where $x \to \infty$ or
$x \to -\infty$ instead of a. We have to make a slight change to the statement,
since the right-hand side can’t be $g(\infty)$. So the best we can do is as
follows:

if $\lim_{x \to \infty} g(x) = L$ and f is continuous at L, then $\lim_{x \to \infty} f(g(x)) = f(L)$,

and similarly for the case where $x \to -\infty$. I leave it to you to write out
the details of the proofs, but here’s the basic idea. Your game with Smiddy
will be the same, but mine changes slightly: I pick N instead of $\delta$, and
the inequality $|x - a| < \delta$ has to be replaced by $x > N$ or $x < -N$
depending on whether you are in the case of $x \to \infty$ or $x \to -\infty$.

We can now establish the following limit, which appeared in Section 3.4
of Chapter 3:

$$\lim_{x \to \infty} \sin\left(\frac{1}{x}\right) = 0.$$

Indeed, if you set f(x) = sin(x) and g(x) = 1/x, then both f and g are
continuous everywhere, except that g isn’t continuous at x = 0. Since

$$\lim_{x \to \infty} g(x) = \lim_{x \to \infty} \frac{1}{x} = 0,$$

we can use our formula from above to conclude that

$$\lim_{x \to \infty} \sin\left(\frac{1}{x}\right) = \lim_{x \to \infty} f(g(x)) = f\left(\lim_{x \to \infty} g(x)\right) = f(0) = \sin(0) = 0.$$

A more intuitive way of expressing this is that $1/x \to 0$ as $x \to \infty$,
so $\sin(1/x) \to \sin(0) = 0$ as $x \to \infty$.
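For this particular limit you can also play the game from Section A.3.3
directly. The Python sketch below assumes the standard inequality
$|\sin(t)| \leq |t|$, which isn’t proved in this section, so that picking
$N = 1/\varepsilon$ is a legitimate move: $x > N$ gives
$|\sin(1/x)| \leq 1/x < \varepsilon$. The names `pick_N_for_sin` and
`sin_move_is_valid` are made up for the example.

```python
import math


def pick_N_for_sin(eps):
    """My move for sin(1/x) -> 0: N = 1/eps, using |sin(t)| <= |t|."""
    return 1.0 / eps


def sin_move_is_valid(eps, samples=1000):
    """Spot-check |sin(1/x) - 0| < eps at some points x > N."""
    N = pick_N_for_sin(eps)
    return all(abs(math.sin(1.0 / (N + k))) < eps for k in range(1, samples + 1))


for eps in (0.5, 0.001, 1e-6):
    print(eps, pick_N_for_sin(eps), sin_move_is_valid(eps))
```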

A.4.2 Proof of the Intermediate Value Theorem

In Section 5.1.4, we looked at the Intermediate Value Theorem, which says
that if f is continuous on [a, b], and also f(a) < 0 and f(b) > 0, then there
is some number c such that f(c) = 0. Now we’re going to look at the idea of
the proof of this theorem.

Consider the set of values x in the interval [a, b] such that f(x) < 0. We
know that a is in this set, since f(a) < 0, and that b isn’t in the set. We’d
like to find the largest number c which is in the set, but that might not be
possible. For example, what’s the largest number less than 0 itself? There
isn’t one: for any negative number, you can always find a negative number
closer to zero, for example, by dividing your number by 2. On the other hand,
we can find a number c that is a sort of right-hand bookend of the set. In
particular, we can insist that no member of the set is to the right of c, and
also that any open interval with right-hand endpoint c includes at least one
member of the set. (This is due to a nice property of the real line called
completeness.) So here’s what we know, written in symbols:

1. for any x > c in [a, b], we have $f(x) \geq 0$; and

2. for any interval $(c - \delta, c)$ where $\delta > 0$, there is at least one
point x in the interval such that f(x) < 0.

Now let’s get busy. Here’s the big question: what is f(c)? Suppose that it’s
negative. In that case, $c \neq b$ since f(b) > 0. Because f is continuous, the
values of f(x) should be near f(c) when x is near c; this will be a problem
when x is a little to the right of c, because f(x) is supposed to be
nonnegative there but f(c) is negative. More formally, you can choose
$\varepsilon = -f(c)/2$ (which is positive); then your tolerance interval is
$(3f(c)/2, f(c)/2)$, which consists only of negative numbers. I can’t pick any
interval of the form $(c - \delta, c + \delta)$ lying inside [a, b] that works,
since any such interval includes an x which is bigger than c. By condition #1
above, we know that f(x) would have to be nonnegative, which means that it
doesn’t lie in your tolerance region. So it can’t be true that f(c) < 0.
Intuitively, if it is, then your bookend still has books to the right of it!

Perhaps f(c) > 0. In this case, we can’t have c = a since f(a) < 0.
Now, the values of f(x) should be near f(c) when x is near c; so in particular
they should be positive. This is a problem because of condition #2 above.
Specifically, this time you can choose $\varepsilon = f(c)/2$, so that your
tolerance interval is $(f(c)/2, 3f(c)/2)$. I need to try to find an interval
$(c - \delta, c + \delta)$ within [a, b] such that for any x in my interval,
f(x) always lies in your tolerance interval. In particular, f(x) > 0. This
means that f(x) > 0 for all x in the interval $(c - \delta, c)$, which violates
condition #2. So f(c) > 0 isn’t true either; if it were true, then the bookend
could be pushed to the left some more, so it wouldn’t be at c.

What’s left? The only possibility is that f(c) = 0, so we have proved
our theorem. By the way, it’s easy to change the situation to the case when
f(a) > 0 and f(b) < 0 instead; you can either rewrite the proof slightly
differently, or you can just set $g(x) = -f(x)$ and apply the theorem to g
instead of f.
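The theorem only says that such a c exists; it doesn’t tell you how to find
it. If you want to hunt for one numerically, a common approach is bisection,
sketched below in Python. To be clear, bisection is a different technique from
the bookend argument used in the proof above; it is swapped in here because it
is easy to run. The function name `bisect_root`, the tolerance, and the test
function $x^3 - 2$ are all assumptions made for the illustration.

```python
def bisect_root(f, a, b, tol=1e-10):
    """Approximate some c in [a, b] with f(c) = 0, given f(a) < 0 < f(b).

    f is assumed continuous on [a, b]; without that, the loop still ends
    but the answer need not be near a root.
    """
    if not (f(a) < 0 < f(b)):
        raise ValueError("need f(a) < 0 < f(b)")
    while b - a > tol:
        mid = (a + b) / 2.0
        if f(mid) < 0:
            a = mid   # keep f negative at the left end
        else:
            b = mid   # keep f nonnegative at the right end
    return (a + b) / 2.0


root = bisect_root(lambda x: x**3 - 2, 0.0, 2.0)
print(root, root**3)
```

At every stage f is negative at the left endpoint and nonnegative at the right
endpoint, so the shrinking interval always traps a sign change.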

A.4.3 Proof of the Max-Min Theorem 

Now let’s prove the Max-Min Theorem, which we looked at in Section 5.1.6.
The idea is that we once again have a function f which is continuous on the
closed interval [a, b]; the claim is that there is some number c in the interval
which is a maximum for f. As we saw, this means that f(c) is greater than
or equal to every other value of f(x) where x wanders over the whole interval
[a, b].

Here’s how it’s done. The first thing we want to show is that you can
plonk down some horizontal line at y = N, say, such that the function values
f(x) all lie below that line. If you couldn’t do that, then the function would
somehow grow bigger and bigger somewhere inside [a, b], and it wouldn’t have
a maximum. So, let’s suppose you can’t draw such a line. Then for every
positive integer N, there’s some point $x_N$ in [a, b] such that $f(x_N)$ is
above the line y = N. That is, we have found some points $x_N$ such that
$f(x_N) > N$ for every N. Let’s mark them on the x-axis with an X.

Now, where are these marked points? There are infinitely many. So if 
we chop the interval [a, b] in half to get two new intervals, one of them must 
still have infinitely many marked points. Perhaps they both do, but they 
can’t both have finitely many marked points or else the total would be finite. 
Let’s focus on the half of the original interval that has infinitely many marked 
points; if they both do, choose your favorite one (it doesn’t matter). Now 
repeat the exercise with the new, smaller interval: chop it in half. One of the 
halves must have infinitely many marked points. Continue doing this for as 
long as you like, and you will get a collection of intervals which get smaller and 
smaller, all nested inside each other, and each of which has infinitely many 
marked points. Stacking the intervals on top of each other, this is what the 
situation might look like: 


[Figure: the nested intervals stacked on top of each other. Each segment is
either the right half or the left half of the one below it; infinitely many
marked points lie below each segment.]


Intuitively, there has to be some real number which is inside every single one
of these intervals.* Let’s call the number q. What is f(q)? We can use the
continuity of f to get some idea of what it should be. Indeed, we know that

$$\lim_{x \to q} f(x) = f(q).$$

So if you pick your $\varepsilon$ to be 1, for example, then I should be able to
find an interval $(q - \delta, q + \delta)$ so that $|f(x) - f(q)| < 1$ for all
x in the interval. The problem is that the interval $(q - \delta, q + \delta)$
contains infinitely many marked points! This is because eventually one of the
little nested intervals that we chose will lie within
$(q - \delta, q + \delta)$, no matter how small $\delta$ is. This is a real
problem: we are supposed to have all these marked points inside our interval
$(q - \delta, q + \delta)$, but when you take f of any of them, you get a number
between $f(q) - 1$ and $f(q) + 1$. So, no matter what f(q) is, we’re going to
get in trouble: some of the marked points are going to have function values
which are much bigger than $f(q) + 1$. The whole thing is out of control. So we
were wrong about not being able to draw in a line like y = N which had the
whole function beneath it!

* Again, one needs to use the completeness property of the real line to show
this. Actually, there has to be exactly one such number. Can you see why?

We’re still not done. We have this line y = N which lies above the graph
of y = f(x) on [a, b], but now we need to move it down until it hits the graph
in order to find the maximum. So, let’s pick N as small as possible so that
$f(x) \leq N$ for all x in [a, b]. (We have used completeness once again.) Now
we need to show that N = f(c) for some c. To do this, we’re going to repeat the
same trick as we did above with marked points, except this time they’ll be
circled. Pick a positive integer n; we must be able to find some number $c_n$
in [a, b] such that $f(c_n) > N - 1/n$. If not, then we should have drawn our
line at $y = N - 1/n$ (or even lower) instead of y = N. So there is such a
$c_n$, and there’s one for every positive integer n. Circle all of these
points. There are infinitely many of them, and when you apply f to them, the
resulting values get closer and closer (arbitrarily close, in fact) to N. (None
of the values can be bigger than N because $f(x) \leq N$ for all x!) Now all we
have to do is keep bisecting the interval [a, b] over and over again, such that
each little interval has infinitely many circled points in it. As before, there
is a number c in all the intervals. This number is really surrounded by a fog
of circled points.

What is f(c)? It can’t be more than N, but maybe it can be less than N.
Let’s suppose that f(c) = M, where M < N, and let’s set
$\varepsilon = (N - M)/2$. Since f is continuous, we really need

$$\lim_{x \to c} f(x) = f(c) = M.$$

You have your $\varepsilon$, and so I need to find an interval
$(c - \delta, c + \delta)$ so that f(x) lies in
$(M - \varepsilon, M + \varepsilon)$ for x in my interval. The problem is that
$M + \varepsilon = N - \varepsilon$, and also that there are infinitely many
circled points lying in $(c - \delta, c + \delta)$, no matter how I choose
$\delta > 0$. Some of them might have function values lying in
$(M - \varepsilon, M + \varepsilon)$, but since the function values get closer
to N, most of them won’t. So I can’t make my move. The only way out is that
f(c) = N after all. This means that c is a maximum, and we’re done!

To get the minimum version of the theorem, just reapply the theorem to
$g(x) = -f(x)$. After all, if c is a maximum for g, then it is a minimum for f.

A.5 Exponentials and Logarithms Revisited


In Section 9.2 of Chapter 9, we developed the theory of exponentials and
logarithms, culminating in the discovery that

$$\frac{d}{dx} e^x = e^x \quad \text{and} \quad \frac{d}{dx} \ln(x) = \frac{1}{x}.$$

There is one loose end: we claimed that

$$\lim_{h \to 0^+} (1 + h)^{1/h}$$

exists, and called it e, but we never proved it. It’s possible to show directly
that the above limit exists, but it’s not particularly informative. Instead, I’m
going to assume that you’ve learned about integration and the Fundamental
Theorems of Calculus (see Chapters 16 and 17) and take a different approach
to the subject at hand. In fact, it all begins with logarithms.

Let’s start by defining a function F by the rule

$$F(x) = \int_1^x \frac{1}{t}\,dt$$

for all x > 0. This is a function based on the integral of another function;
see Section 17.1 of Chapter 17 to remind yourself about this sort of function.
Now, I know that you can just write

$$F(x) = \int_1^x \frac{1}{t}\,dt = \ln|t|\,\Big|_1^x = \ln|x| - \ln|1| = \ln(x),$$

since x > 0 and ln(1) = 0. The problem is, we are jumping the gun! If we
are really going to do this properly, we’re not allowed to use the fact that
$\int 1/t\,dt = \ln|t| + C$. Actually, that’s one of the things we’re trying to
show. So for the moment, we can’t assume that F(x) = ln(x); let’s start by
proving that.

So let’s write down some interesting properties of this function F. The
derivative of F is given by

$$F'(x) = \frac{d}{dx} \int_1^x \frac{1}{t}\,dt = \frac{1}{x}$$

by the First Fundamental Theorem of Calculus. So F is differentiable, which
means that it’s continuous (see Section 5.2.11 of Chapter 5). Next, set x = 1
to see that

$$F(1) = \int_1^1 \frac{1}{t}\,dt = 0,$$

using the property that the integral of any function is 0 if both limits of
integration are equal and the function is actually defined there (see
Section 16.3 of Chapter 16). How about

$$\lim_{x \to \infty} F(x)?$$

Actually, by the definition of the improper integral (as given in Section 20.2
of Chapter 20), we have

$$\lim_{x \to \infty} F(x) = \lim_{x \to \infty} \int_1^x \frac{1}{t}\,dt = \int_1^\infty \frac{1}{t}\,dt = \infty.$$

We have to be really careful about saying that the improper integral
$\int_1^\infty 1/t\,dt$ diverges. When we originally proved this divergence, we
used the formula $\int 1/t\,dt = \ln|t| + C$, but we’re not allowed to do this!
Instead, the way to do it is to use the integral test to say that
$\int_1^\infty 1/t\,dt$ and $\sum_{n=1}^\infty 1/n$ either both converge or both
diverge; then use the argument from Section 22.4.3 of Chapter 22 to show that
the series diverges; so the integral diverges too. So we have

$$F(1) = 0 \quad \text{and} \quad \lim_{x \to \infty} F(x) = \infty.$$


Since F is continuous, the Intermediate Value Theorem (see Section 5.1.4 of
Chapter 5) says that there must be a number e such that F(e) = 1. After all,
1 is between 0 and $\infty$! Also, since $F'(x) = 1/x > 0$ for all x > 0, we
know that F is always increasing. So there can’t be any other number c such
that F(c) = 1. We have arrived at our official definition of e:

e is the number such that $\int_1^e \frac{1}{t}\,dt = 1$.

Now let’s pick a rational number a, and define

$$G(x) = F(x^a) = \int_1^{x^a} \frac{1}{t}\,dt.$$

We can use the Variation 2 technique described in Section 17.5.2 of Chapter 17
to see that

$$G'(x) = \frac{d}{dx} \int_1^{x^a} \frac{1}{t}\,dt = a x^{a-1} \cdot \frac{1}{x^a} = a \cdot \frac{1}{x}.$$

(This assumes that we know that $\frac{d}{dx}(x^a) = a x^{a-1}$ without using
logarithmic differentiation; see if you can prove this fact for all rational
numbers, knowing only that it’s true for positive integers, as we saw in
Section 6.1 of Chapter 6.) On the other hand, we know that $F'(x) = 1/x$, so
the above equation implies that $G'(x) = aF'(x)$. Since a is constant, we see
that G(x) = aF(x) + C, where C is constant. In particular, if we set x = 1,
this equation becomes G(1) = aF(1) + C. Now $G(1) = F(1^a) = F(1) = 0$, so
C = 0. Since $G(x) = F(x^a)$, we’ve shown that $F(x^a) = aF(x)$ for any
rational number a and x > 0. In fact, since F is continuous, the same thing
must be true for any real a at all! Now set x = e to see that
$F(e^a) = aF(e) = a$, since F(e) = 1. Changing a to x, we have shown that
$F(e^x) = x$. So F is the inverse function of $e^x$, which means that
F(x) = ln(x). Since we know that $F'(x) = 1/x$, we have shown that
$\frac{d}{dx} \ln(x) = 1/x$. Now if $y = e^x$, then x = ln(y), so

$$\frac{dx}{dy} = \frac{1}{y} = \frac{1}{e^x};$$

by the chain rule, $dy/dx = e^x$. So we’ve differentiated both ln(x) and $e^x$
from scratch, and shown that e exists!

Now all we need to do is show that

$$\lim_{h \to 0^+} (1 + h)^{1/h} = e.$$

This has become pretty easy: let $y = (1 + h)^{1/h}$, so that
$\ln(y) = \ln(1 + h)/h$. Then

$$\lim_{h \to 0^+} \ln(y) = \lim_{h \to 0^+} \frac{\ln(1 + h)}{h} = 1$$

by the same argument we used in Section 9.4.3 of Chapter 9 (or just l’Hôpital’s
Rule). Of course, if $\ln(y) \to 1$ as $h \to 0^+$, then $y \to e^1 = e$ as
$h \to 0^+$. This proves the above limit. The key point is that once you know
that the derivative with respect to x of ln(x) is 1/x, then you’re golden:
everything else is easy.
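You can watch this limit settle down numerically. The Python sketch below
just evaluates $(1 + h)^{1/h}$ for a few small positive h; the name `approx_e`
and the particular values of h are choices made for the illustration. Very
tiny h would eventually run into floating-point rounding, so the table stops
at $10^{-6}$.

```python
import math


def approx_e(h):
    """Evaluate (1 + h)^(1/h) for a small positive h."""
    return (1.0 + h) ** (1.0 / h)


for h in (0.1, 0.01, 0.001, 1e-4, 1e-5, 1e-6):
    # Second column: the approximation; third column: its distance from math.e.
    print(h, approx_e(h), abs(approx_e(h) - math.e))
```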

A.6 Differentiation and Limits


In this section, we’ll prove some results involving derivatives and limits. More 
specifically, we’ll deal with differentiating constant multiples of functions, 
sums, and differences of functions, and the product, quotient, and chain rules; 
then we’ll prove the Extreme Value Theorem, Rolle’s Theorem, the Mean 
Value Theorem, and the formula for the error term in linearization. We’ll 
finish off by looking at derivatives of piecewise-defined functions and a proof 
of l’Hôpital’s Rule.


A.6.1 Constant multiples of functions


Suppose y is a differentiable function of x and c is some constant. We want
to show that

$$\frac{d}{dx}(cy) = c\,\frac{dy}{dx}.$$

It’s pretty easy. Define f by y = f(x); then the left-hand side of the above
equation is

$$\lim_{\Delta x \to 0} \frac{cf(x + \Delta x) - cf(x)}{\Delta x}.$$

All you have to do is take out a factor of c from the numerator and drag it
out of the limit. This was justified at the end of Section A.2.2 above:

$$\lim_{\Delta x \to 0} \frac{cf(x + \Delta x) - cf(x)}{\Delta x} = \lim_{\Delta x \to 0} \frac{c(f(x + \Delta x) - f(x))}{\Delta x} = c \lim_{\Delta x \to 0} \frac{f(x + \Delta x) - f(x)}{\Delta x}.$$

The right-hand side is just $cf'(x)$, which is the same thing as c(dy/dx), and
we’re all done.


A.6.2 Sums and differences of functions

If u and v are differentiable functions of x, we’d like to show that

$$\frac{d}{dx}(u + v) = \frac{du}{dx} + \frac{dv}{dx},$$

and similarly with both plus signs replaced by minus signs. There’s almost
nothing to this. If u = f(x) and v = g(x), then the left-hand side of the above
equation is

$$\lim_{\Delta x \to 0} \frac{f(x + \Delta x) + g(x + \Delta x) - (f(x) + g(x))}{\Delta x}.$$

All you have to do is rearrange the sum and split up the limit, which was
justified in Section A.2.1 above, to see that the above limit is equal to

$$\lim_{\Delta x \to 0} \frac{f(x + \Delta x) - f(x)}{\Delta x} + \lim_{\Delta x \to 0} \frac{g(x + \Delta x) - g(x)}{\Delta x}.$$

But this is just $f'(x) + g'(x)$, which equals the right-hand side of the
equation we’re trying to prove. The situation with minus signs instead of plus
signs is just as easy!


A.6.3 Proof of the product rule

For the proofs of the product and quotient rules, we’ll stick with the dy/dx
rather than $f'(x)$ notation, as it’s easier to understand the concepts using
the former version. As we saw in Section 5.2.7, we have

$$\frac{dy}{dx} = \lim_{\Delta x \to 0} \frac{\Delta y}{\Delta x},$$

with the understanding that $\Delta y$ is the amount y changes when you move x
to $x + \Delta x$.

So we want to prove the product rule, which says that

$$\frac{d}{dx}(uv) = v\,\frac{du}{dx} + u\,\frac{dv}{dx}.$$

Suppose we change x to $x + \Delta x$. Then u changes to $u + \Delta u$, and v
changes to $v + \Delta v$. This means that uv changes to
$(u + \Delta u)(v + \Delta v)$. How much of a change is this? Take the
difference between the old and new quantities to see that

$$\Delta(uv) = (u + \Delta u)(v + \Delta v) - uv.$$

Expanding and canceling, we end up with

$$\Delta(uv) = v\,\Delta u + u\,\Delta v + \Delta u\,\Delta v.$$
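The increment identity is pure algebra, so it should hold (up to rounding) for
any functions and any $\Delta x$. Here is a quick Python check using two
functions picked just for illustration, $u = x^2$ and $v = x^3 + 1$; the
function names are hypothetical. The sketch also computes the quotient
$\Delta(uv)/\Delta x$ for a small $\Delta x$, which for these two functions
should land near $5x^4 + 2x$, the derivative of $x^5 + x^2$.

```python
def u(x):
    """Illustrative choice u = x^2."""
    return x**2


def v(x):
    """Illustrative choice v = x^3 + 1."""
    return x**3 + 1


def increment_both_ways(x, dx):
    """Return Delta(uv) computed directly and via v*Du + u*Dv + Du*Dv."""
    du = u(x + dx) - u(x)
    dv = v(x + dx) - v(x)
    direct = u(x + dx) * v(x + dx) - u(x) * v(x)
    expanded = v(x) * du + u(x) * dv + du * dv
    return direct, expanded


def difference_quotient(x, dx):
    """Delta(uv)/Delta x, which approximates the derivative of uv at x."""
    direct, _ = increment_both_ways(x, dx)
    return direct / dx


print(increment_both_ways(1.5, 1e-3))
print(difference_quotient(1.5, 1e-6), 5 * 1.5**4 + 2 * 1.5)
```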


Now divide this equation by $\Delta x$. In the case of the last term, we’ll
even divide by an extra $\Delta x$, but then multiply by it once more to make
things balance. We end up with

$$\frac{\Delta(uv)}{\Delta x} = v\,\frac{\Delta u}{\Delta x} + u\,\frac{\Delta v}{\Delta x} + \frac{\Delta u}{\Delta x}\,\frac{\Delta v}{\Delta x}\,\Delta x.$$

If you take limits as $\Delta x \to 0$, then all the ratios go to the
corresponding derivatives, but the final factor of $\Delta x$ goes to 0:

$$\frac{d}{dx}(uv) = v\,\frac{du}{dx} + u\,\frac{dv}{dx} + \frac{du}{dx}\,\frac{dv}{dx} \times 0.$$

Since the last term is 0, we have proved the product rule. Now you should
try writing out a proof using the f(x) notation (version 1) instead.

A.6.4 Proof of the quotient rule

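Now we want to show that

$$\frac{d}{dx}\left(\frac{u}{v}\right) = \frac{v\,\frac{du}{dx} - u\,\frac{dv}{dx}}{v^2}.$$

Again, when x changes to $x + \Delta x$, we know that u and v change to
$u + \Delta u$ and $v + \Delta v$, respectively. This means that u/v changes to
$(u + \Delta u)/(v + \Delta v)$. The amount of change is

$$\Delta\left(\frac{u}{v}\right) = \frac{u + \Delta u}{v + \Delta v} - \frac{u}{v}.$$

Taking a common denominator and canceling $uv - uv$ leads to

$$\Delta\left(\frac{u}{v}\right) = \frac{v\,\Delta u - u\,\Delta v}{v^2 + v\,\Delta v}.$$

Dividing this by $\Delta x$, and then multiplying and dividing the $\Delta v$
term in the denominator by $\Delta x$, gives

$$\frac{\Delta\left(\frac{u}{v}\right)}{\Delta x} = \frac{v\,\frac{\Delta u}{\Delta x} - u\,\frac{\Delta v}{\Delta x}}{v^2 + v\,\frac{\Delta v}{\Delta x}\,\Delta x}.$$

Now let $\Delta x \to 0$. All fractions become derivatives, and the final factor
on the bottom goes to 0, so we end up with

$$\frac{d}{dx}\left(\frac{u}{v}\right) = \frac{v\,\frac{du}{dx} - u\,\frac{dv}{dx}}{v^2 + v\,\frac{dv}{dx} \times 0}.$$

Since the final term in the denominator is just 0, we have proved the quotient
rule.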


A.6.5 Proof of the chain rule 

Suppose that y is a differentiable function of u, which is itself a
differentiable function of x. We want to prove that

$$\frac{dy}{dx} = \frac{dy}{du}\,\frac{du}{dx}.$$

At first glance there’s nothing to this using the $\Delta$ notation: you just
write

$$\frac{\Delta y}{\Delta x} = \frac{\Delta y}{\Delta u}\,\frac{\Delta u}{\Delta x}$$

and take limits. Unfortunately, $\Delta u$ might sometimes be 0, which would
invalidate the whole equation. So let’s use the function notation. Let f and g
be differentiable, and set h(x) = f(g(x)). We want to show that

$$h'(x) = f'(g(x))\,g'(x).$$

If g is constant near x, then so is h, so both sides of this equation are 0.
Otherwise, we know that

$$h'(x) = \lim_{\Delta x \to 0} \frac{h(x + \Delta x) - h(x)}{\Delta x} = \lim_{\Delta x \to 0} \frac{f(g(x + \Delta x)) - f(g(x))}{\Delta x}.$$

Multiply and divide the fraction by $g(x + \Delta x) - g(x)$, which must be
nonzero for infinitely many values of $\Delta x$ near 0, then split up the limit
to get

$$h'(x) = \lim_{\Delta x \to 0} \frac{f(g(x + \Delta x)) - f(g(x))}{g(x + \Delta x) - g(x)} \times \lim_{\Delta x \to 0} \frac{g(x + \Delta x) - g(x)}{\Delta x}.$$

The right-hand limit is just $g'(x)$, but how about the left-hand one? The
trick is to set $\varepsilon = g(x + \Delta x) - g(x)$. Then the quantity
$g(x + \Delta x)$ in the numerator of the left-hand limit can be written as
$g(x) + \varepsilon$ (can you see why?), whereas the denominator is just
$\varepsilon$ itself. So we have

$$h'(x) = \lim_{\Delta x \to 0} \frac{f(g(x) + \varepsilon) - f(g(x))}{\varepsilon} \times g'(x).$$

Now what happens to $\varepsilon$ when $\Delta x \to 0$? Since g is
differentiable, we know from Section 5.2.11 that g is continuous. In particular,

$$\lim_{\Delta x \to 0} g(x + \Delta x) = g(x).$$

If you subtract g(x) from both sides, you see that $\varepsilon \to 0$ when
$\Delta x \to 0$. This means that in our expression for $h'(x)$, we can replace
the $\Delta x \to 0$ by $\varepsilon \to 0$ and get

$$h'(x) = \lim_{\varepsilon \to 0} \frac{f(g(x) + \varepsilon) - f(g(x))}{\varepsilon} \times g'(x).$$

Now the first term is exactly $f'(g(x))$, so $h'(x) = f'(g(x))\,g'(x)$ and we
have proved the chain rule.
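The splitting step in this proof can be seen numerically. The Python sketch
below uses two functions chosen only for illustration, $f(y) = \sin(y)$ and
$g(x) = x^2$, and computes the two factors of the split quotient separately
for a small $\Delta x$: the first should be close to $f'(g(x)) = \cos(x^2)$ and
the second close to $g'(x) = 2x$. All the names are hypothetical, and the
sketch assumes $\varepsilon \neq 0$, which holds for this g at x = 0.7.

```python
import math


def g(x):
    """Inner function for the illustration, g(x) = x^2."""
    return x * x


def f(y):
    """Outer function for the illustration, f(y) = sin(y)."""
    return math.sin(y)


def split_quotient(x, dx):
    """Return the two factors of the difference quotient of f(g(x))."""
    eps = g(x + dx) - g(x)                      # the epsilon from the proof
    first = (f(g(x) + eps) - f(g(x))) / eps     # tends to f'(g(x))
    second = eps / dx                           # tends to g'(x)
    return first, second


def exact_derivative(x):
    """Chain rule answer for this example: cos(x^2) * 2x."""
    return math.cos(x * x) * 2.0 * x


first, second = split_quotient(0.7, 1e-6)
print(first, math.cos(0.49))
print(second, 2 * 0.7)
print(first * second, exact_derivative(0.7))
```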


A.6.6 Proof of the Extreme Value Theorem

In Section 11.1.2 of Chapter 11, we stated the Extreme Value Theorem. This 
says that if a: = c is a local maximum or minimum for a function /, then 
x = c is a, critical point for /. This means that either f f (c) doesn’t exist, or 
f(c) = o. 

To prove this, let’s first suppose that $x = c$ is a local minimum for $f$. If $f'(c)$ doesn’t exist, then it’s a critical point, which is exactly what we were hoping for. On the other hand, if $f'(c)$ exists, then

$$f'(c) = \lim_{h \to 0} \frac{f(c+h) - f(c)}{h}.$$


Since $c$ is a local minimum, we know that $f(c+h) \ge f(c)$ when $c+h$ is very close to $c$. Of course, $c+h$ is close to $c$ exactly when $h$ is close to 0. For such $h$, the numerator $f(c+h) - f(c)$ in the above fraction must be nonnegative. When $h > 0$, the quantity

$$\frac{f(c+h) - f(c)}{h}$$

is positive (or 0), but when $h < 0$, the quantity is negative (or 0). So the right-hand limit

$$\lim_{h \to 0^+} \frac{f(c+h) - f(c)}{h}$$

must be greater than or equal to 0, while the same left-hand limit is less than or equal to 0. Since the two-sided limit exists, the left-hand and right-hand limits are equal; the only possibility is that they are both 0. This shows that $f'(c) = 0$, so $x = c$ is once again a critical point for $f$.

How about if $x = c$ is a local maximum? I leave it to you to repeat the argument. The only difference is that the quantity $f(c+h) - f(c)$ is now negative (or 0) when $h$ is close to 0.


A.6.7 Proof of Rolle’s Theorem

Suppose $f$ is continuous on $[a, b]$, differentiable on $(a, b)$, and satisfies the condition $f(a) = f(b)$. Then we want to show that there is a number $c$ in $(a, b)$ such that $f'(c) = 0$. To do this, we use the Max-Min Theorem to say that $f$ has a global maximum and a global minimum in $[a, b]$. If either the maximum or the minimum occurs at some number $c$ in $(a, b)$, then the Extreme Value Theorem says that $f'(c) = 0$. (We know that $f'(c)$ exists since $f$ is differentiable in $(a, b)$.) The only other possibility is that the global maximum and the global minimum both occur at the endpoints $a$ and $b$. In that case, since $f(a) = f(b)$, the function must be constant, so every number $c$ in $(a, b)$ satisfies $f'(c) = 0$. That’s all there is to the proof!


A.6.8 Proof of the Mean Value Theorem


Now we have $f$ which is continuous on $[a, b]$ and differentiable on $(a, b)$, but we don’t assume that $f(a) = f(b)$. The Mean Value Theorem says that there is some $c$ in $(a, b)$ with

$$f'(c) = \frac{f(b) - f(a)}{b - a}.$$

To prove this, define a new function $g$ by the equation

$$g(x) = f(x) - \frac{f(b) - f(a)}{b - a}\,(x - a).$$

It looks a little complicated, but actually all we are doing is subtracting a constant multiple of the linear function $(x - a)$ from $f(x)$ and calling it $g$. So the function $g$ is also continuous on $[a, b]$ and differentiable on $(a, b)$, and what’s more, we have

$$g(a) = f(a) - \frac{f(b) - f(a)}{b - a}\,(a - a) = f(a) \quad\text{and}$$

$$g(b) = f(b) - \frac{f(b) - f(a)}{b - a}\,(b - a) = f(a).$$

So we have shown that $g(a) = g(b)$, which means we can apply Rolle’s Theorem! We end up with a number $c$ such that $g'(c) = 0$. Now we just have to differentiate $g$ and see what that means for $f$. Since the quantities $f(b) - f(a)$ and $b - a$ are constant, we get

$$g'(x) = f'(x) - \frac{f(b) - f(a)}{b - a}.$$

Now plug in $x = c$. Since $g'(c) = 0$, we have

$$0 = f'(c) - \frac{f(b) - f(a)}{b - a}.$$

This means that

$$f'(c) = \frac{f(b) - f(a)}{b - a},$$

which is exactly what we wanted to show!
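
Here is a quick sketch of the proof's idea in action. The function $f(x) = x^3$ on $[0, 2]$ is my own choice of example, and `bisect` is a hypothetical helper written just for this illustration. The code hunts for a zero of $g'(x) = f'(x) - \frac{f(b)-f(a)}{b-a}$, which is exactly the number $c$ that Rolle's Theorem hands us in the proof.

```python
def f(x):
    return x ** 3            # illustrative function (assumption)

def f_prime(x):
    return 3 * x ** 2

def bisect(func, lo, hi, tol=1e-12):
    """Find a root of func in [lo, hi], assuming func changes sign there."""
    f_lo = func(lo)
    while hi - lo > tol:
        mid = (lo + hi) / 2
        f_mid = func(mid)
        if (f_lo < 0) == (f_mid < 0):
            lo, f_lo = mid, f_mid   # root lies in the right half
        else:
            hi = mid                # root lies in the left half
    return (lo + hi) / 2

a, b = 0.0, 2.0
slope = (f(b) - f(a)) / (b - a)     # average rate of change on [a, b]

def g_prime(x):
    # Derivative of g(x) = f(x) - slope * (x - a) from the proof
    return f_prime(x) - slope

c = bisect(g_prime, a, b)
print(f"slope = {slope}, c = {c:.10f}, f'(c) = {f_prime(c):.10f}")
```

For this particular $f$ you can also solve $3c^2 = 4$ by hand to get $c = 2/\sqrt{3}$, which is a handy check on the output.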


A.6.9 The error in linearization 

Let’s tie up another loose end. In Section 13.2 of Chapter 13, we looked at the linearization $L$ of a function $f$ about $x = a$, where $a$ is some number in the domain of $f$:

$$L(x) = f(a) + f'(a)(x - a).$$

If $x$ is near $a$, we can use $L(x)$ to estimate the value of $f(x)$. How wrong could we possibly be? According to the formula in Section 13.2.4 of Chapter 13, if $f''$ exists between $x$ and $a$, then

$$|\text{error}| = \tfrac{1}{2}\,|f''(c)|\,|x - a|^2;$$

here $c$ is some number between $x$ and $a$. Let’s prove this formula. Start off by calling the error term $r(x)$; since $r(x)$ is the difference between the true value $f(x)$ and our guess, which is the linearization $L(x) = f(a) + f'(a)(x - a)$, we have

$$r(x) = f(x) - L(x) = f(x) - f(a) - f'(a)(x - a).$$

Now, the clever idea is to fix $x$ as a constant and let $a$ be the variable. Inspired by this, let

$$g(t) = f(x) - f(t) - f'(t)(x - t).$$

So the error $r(x)$ arises exactly when $t = a$. That is, the error is $g(a)$. Note that

$$g(x) = f(x) - f(x) - f'(x)(x - x) = 0.$$

Let’s differentiate $g$ with respect to $t$. The term $f(x)$ is constant, so its derivative is 0. Also, we need the product rule to deal with $f'(t)(x - t)$. All in all, we get

$$g'(t) = 0 - f'(t) - \bigl(f'(t) \times (-1) + f''(t)(x - t)\bigr) = -f''(t)(x - t).$$

In particular, we have

$$g'(x) = -f''(x)(x - x) = 0.$$

Everything we’ve done so far makes a lot of sense. Now we have to do something that seems to be a little whacked out. Remember that we want to show that the error is $\tfrac{1}{2} f''(c)(x - a)^2$, where $c$ is between $x$ and $a$. Since the error is $g(a)$, this suggests that $g(t)$ is something like $K(x - t)^2$, where $K$ is some number which does not depend on $t$, but only on $x$ and $a$. Even this isn’t exactly true, but it might explain why we’re going to let

$$h(t) = g(t) - K(x - t)^2.$$

You see, when you differentiate this with respect to $t$, holding $x$ constant, you get

$$h'(t) = g'(t) + 2K(x - t).$$

So what? Well, we can use the Mean Value Theorem (see Section 11.3 of Chapter 11) to get

$$h'(c) = \frac{h(x) - h(a)}{x - a}$$

for some $c$ between $x$ and $a$. We can substitute for $h'(c)$, $h(x)$, and $h(a)$ using the above equations:

$$g'(c) + 2K(x - c) = \frac{\bigl(g(x) - K(x - x)^2\bigr) - \bigl(g(a) - K(x - a)^2\bigr)}{x - a} = \frac{-g(a) + K(x - a)^2}{x - a},$$

since $g(x) = 0$. Since $g'(c) = -f''(c)(x - c)$, this last equation can be rearranged to

$$g(a) - K(x - a)^2 = (x - a)(x - c)\bigl(f''(c) - 2K\bigr).$$

We’re close, but there’s still a problem. We can’t handle the factor $(x - c)$, since that’s nowhere to be found in our error term! The only way we can get rid of it is if the left-hand side is actually 0. That is, we should have chosen $K$ such that $g(a) - K(x - a)^2 = 0$. Indeed, if $K = g(a)/(x - a)^2$, then the above equation becomes

$$0 = (x - a)(x - c)\left(f''(c) - \frac{2g(a)}{(x - a)^2}\right).$$

Since $x \ne a$ and $x \ne c$, we must have

$$f''(c) - \frac{2g(a)}{(x - a)^2} = 0,$$

which means that $g(a) = \tfrac{1}{2} f''(c)(x - a)^2$. Since $g(a) = r(x)$ is the error we’re looking for, we’re finished.
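
To make the formula feel less abstract, here is a sketch with a concrete function. The choices $f(x) = e^x$, $a = 0$ and $x = 0.1$ are my own assumptions for illustration. Since $f''(c) = e^c$ here, we can actually solve $r(x) = \tfrac{1}{2} f''(c)(x - a)^2$ for $c$ and check that it really does land between $a$ and $x$.

```python
import math

a, x = 0.0, 0.1                      # illustrative base point and nearby point (assumptions)

# Linearization L(x) = f(a) + f'(a)(x - a) for f(x) = e^x, where f'(a) = e^a
linearization = math.exp(a) + math.exp(a) * (x - a)

# r(x) = f(x) - L(x) is the error term from the proof
error = math.exp(x) - linearization

# Solve error = (1/2) f''(c) (x - a)^2 for c, using f''(c) = e^c
c = math.log(2 * error / (x - a) ** 2)

print(f"error = {error:.8f}")
print(f"c = {c:.8f} (should lie between {a} and {x})")
```

The proof only promises that such a $c$ exists; being able to solve for it explicitly is a luxury of this particular $f$.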


A.6.10 Derivatives of piecewise-defined functions

Imagine that $f$ is defined in piecewise fashion as

$$f(x) = \begin{cases} f_1(x) & \text{if } x > a, \\ f_2(x) & \text{if } x \le a. \end{cases}$$

(You could change $x > a$ to $x \ge a$, and $x \le a$ to $x < a$; it doesn’t make a difference.) Anyway, in Section 6.6 of Chapter 6, we considered the question of whether $f$ is differentiable at $a$. We have assumed that if the functions $f_1$ and $f_2$ match at $x = a$, and also the derivatives $f_1'$ and $f_2'$ match at $x = a$, then $f$ is differentiable at $a$. How can we justify this? Well, first note that the matching of $f_1$ and $f_2$ at $x = a$ means that

$$\lim_{x \to a^+} f_1(x) = \lim_{x \to a^-} f_2(x) = f(a).$$

This ensures that $f$ is at least continuous. Now we are also assuming that the derivatives match: this means that $f_1$ is differentiable to the immediate right of $a$, $f_2$ is differentiable to the immediate left of $a$, and

$$\lim_{x \to a^+} f_1'(x) = \lim_{x \to a^-} f_2'(x) = L,$$

where $L$ is some nice finite number. So consider the quantity

$$\frac{f(a+h) - f(a)}{h}$$

for some small number $h \ne 0$. If $h > 0$, then we can apply the Mean Value Theorem (see Section 11.3 of Chapter 11) to say that

$$\frac{f(a+h) - f(a)}{h} = f_1'(c),$$

where $c$ is some number between $a$ and $a+h$. (Here we needed the continuity of $f$ on $[a, a+h]$.) By the sandwich principle, as $h \to 0^+$, the number $c$ is sandwiched between $a$ and $a+h$, so $c \to a^+$ as $h \to 0^+$. We now see that

$$\lim_{h \to 0^+} \frac{f(a+h) - f(a)}{h} = \lim_{h \to 0^+} f_1'(c) = \lim_{c \to a^+} f_1'(c) = L.$$

The left-hand limit works the same way, except that we use $f_2'$ instead of $f_1'$ to see that

$$\lim_{h \to 0^-} \frac{f(a+h) - f(a)}{h} = \lim_{h \to 0^-} f_2'(c) = \lim_{c \to a^-} f_2'(c) = L.$$

The left-hand and right-hand limits are both equal to $L$, so we have shown that $f'(a)$ exists and is also equal to $L$.

A.6.11 Proof of l’Hôpital’s Rule

Let’s prove l’Hôpital’s Rule (see Chapter 14). Specifically, suppose we have two functions $f$ and $g$ which are both differentiable on some interval containing a point $a$ (but maybe not at $a$ itself); and $f(a) = g(a) = 0$; and also $g'(x) \ne 0$ except maybe at $a$ itself. Then we need to show that

$$\lim_{x \to a} \frac{f(x)}{g(x)} = \lim_{x \to a} \frac{f'(x)}{g'(x)},$$

provided that the limit on the right-hand side exists. We’ll need a slightly different version of the Mean Value Theorem, called Cauchy’s Mean Value Theorem: if $f$ and $g$ are continuous on $[A, B]$ and differentiable on $(A, B)$, and also $g'(x) \ne 0$ on $(A, B)$, then there is some $C$ in $(A, B)$ such that

$$\frac{f'(C)}{g'(C)} = \frac{f(B) - f(A)}{g(B) - g(A)}.$$

Let’s prove this first, then use it to prove l’Hôpital’s Rule. Incidentally, note that if $g(x) = x$ for all $x$, then $g'(x) = 1$ and the above equation becomes

$$f'(C) = \frac{f(B) - f(A)}{B - A}.$$

This is just the regular Mean Value Theorem! That doesn’t really help us, though. Let’s go back to the original equation above and look at the denominator on the right-hand side, which is $g(B) - g(A)$. That can’t be equal to 0; if it were, then $g(A) = g(B)$, meaning that $g'(C) = 0$ for some $C$ in $(A, B)$ by Rolle’s Theorem (see Section 11.2 of Chapter 11). So the right-hand side makes sense. Now, define a new function $h$ by

$$h(x) = f(x) - \left(\frac{f(B) - f(A)}{g(B) - g(A)}\right) g(x)$$


for all $x$ in $(A, B)$. (Compare this with the function we called $g$ in the proof of the ordinary Mean Value Theorem in Section A.6.8 above.) Anyway, let’s write down some nice facts about this function. First, let’s calculate $h(A)$ and $h(B)$. We have

$$h(A) = f(A) - \left(\frac{f(B) - f(A)}{g(B) - g(A)}\right) g(A) = \frac{f(A)g(B) - f(A)g(A) - f(B)g(A) + f(A)g(A)}{g(B) - g(A)} = \frac{f(A)g(B) - f(B)g(A)}{g(B) - g(A)},$$

whereas

$$h(B) = f(B) - \left(\frac{f(B) - f(A)}{g(B) - g(A)}\right) g(B) = \frac{f(B)g(B) - f(B)g(A) - f(B)g(B) + f(A)g(B)}{g(B) - g(A)} = \frac{f(A)g(B) - f(B)g(A)}{g(B) - g(A)}.$$

So $h(A) = h(B)$. Also, note that $h$ is differentiable, and since $A$ and $B$ are constant, we have

$$h'(x) = f'(x) - \left(\frac{f(B) - f(A)}{g(B) - g(A)}\right) g'(x).$$

We can use Rolle’s Theorem, since $h(A) = h(B)$, to conclude that there’s a number $C$ in $(A, B)$ such that $h'(C) = 0$. This means that

$$f'(C) - \left(\frac{f(B) - f(A)}{g(B) - g(A)}\right) g'(C) = 0.$$

If you rearrange this equation, you get the one we want:

$$\frac{f'(C)}{g'(C)} = \frac{f(B) - f(A)}{g(B) - g(A)}.$$

Now we’re ready to prove l’Hôpital’s Rule. Since $f(a) = g(a) = 0$, we have

$$\lim_{x \to a} \frac{f(x)}{g(x)} = \lim_{x \to a} \frac{f(x) - f(a)}{g(x) - g(a)}.$$

If $x > a$, then we can use Cauchy’s Mean Value Theorem (which we just proved) on the interval $[a, x]$ to say that

$$\frac{f(x)}{g(x)} = \frac{f(x) - f(a)}{g(x) - g(a)} = \frac{f'(c)}{g'(c)}$$

for some $c$ in $(a, x)$. Otherwise, if $x < a$, then the same thing is true but $c$ is in $(x, a)$. (Note that we’ve used the fact that $g'$ isn’t 0, except possibly at $a$; that’s one of the conditions of Cauchy’s Mean Value Theorem.) Of course, the number $c$ depends on what $x$ is; but we can see that as $x \to a$, also $c \to a$. So we have

$$\lim_{x \to a} \frac{f(x)}{g(x)} = \lim_{x \to a} \frac{f'(c)}{g'(c)} = \lim_{c \to a} \frac{f'(c)}{g'(c)}.$$

All that’s left is to treat $c$ as the dummy variable and change it to $x$, and l’Hôpital’s Rule is proved!

Well, sort of. We still haven’t proved the $\infty/\infty$ case, nor the case when $x \to \infty$ (or $-\infty$). It’s a great exercise to try to adapt the above proof to these cases, if you dare.


A.7 Proof of the Taylor Approximation Theorem

Now let’s look at how to prove the Taylor approximation theorem from Section 24.1.3 in Chapter 24. Here’s what the theorem says: if $f$ is smooth at $x = a$, then of all the polynomials of degree $N$ or less, the one which best approximates $f(x)$ for $x$ near $a$ is the $N$th-order Taylor polynomial $P_N$, which is given by

$$P_N(x) = f(a) + f'(a)(x - a) + \frac{f''(a)}{2!}(x - a)^2 + \cdots + \frac{f^{(N)}(a)}{N!}(x - a)^N.$$

The plan is to show how this theorem follows from the full Taylor theorem, 
which we looked at in Section 24.1.4 of Chapter 24. I’m omitting the proof 
of the full Taylor theorem because you can find it in most textbooks or even 
by typing “proof of Taylor’s Theorem” into a search engine. What you won’t 
find as easily is the proof of the approximation theorem, so let’s look at it 
now. 

Let’s first simplify matters by setting $a = 0$. Since we’re assuming the full Taylor Theorem has been proved, we know that $f(x) = P_N(x) + R_N(x)$, where

$$P_N(x) = \sum_{n=0}^{N} \frac{f^{(n)}(0)}{n!}\,x^n$$

is a polynomial of degree $N$, and

$$R_N(x) = \frac{f^{(N+1)}(c)}{(N+1)!}\,x^{N+1}$$

for some $c$ between 0 and $x$. (Remember, we have set $a = 0$, so factors like $(x - a)^n$ just become $x^n$ and quantities like $f^{(n)}(a)$ become $f^{(n)}(0)$.) What we want to show is this:

of all polynomials of degree $N$ or less, $P_N$ gives the best approximation to $f$ near 0.

How on earth do you go about showing something like that? What does “best” even mean in this context, anyway? The trick is to pick some other polynomial of degree no more than $N$; let’s call it $Q$. Since $Q$ is different from $P_N$, we know that $Q$ has at least one coefficient which differs from the corresponding coefficient in $P_N$. We want to show that $P_N(x)$ is closer to $f(x)$ than $Q(x)$ is, at least when $x$ is close to 0. To see how close two quantities are, you look at the difference between the quantities. So what we really want to show is the following inequality:

$$|f(x) - P_N(x)| < |f(x) - Q(x)|$$

when $x$ is close to 0. If this is true, then you can conclude that $P_N(x)$ is indeed closer to the ideal value $f(x)$ than $Q(x)$ is.

To get at our desired inequality above, let’s look at both sides individually. The left-hand side is the absolute value of $f(x) - P_N(x)$, which is actually the remainder term $R_N(x)$. We have an expression for $R_N(x)$ above; it has three factors, which are $f^{(N+1)}(c)$, $x^{N+1}$, and $1/(N+1)!$. We know that $c$ is trapped between 0 and $x$; as $x \to 0$, by the sandwich principle we must also have $c \to 0$. Since we are assuming $f$ is very smooth, the function $f^{(N+1)}$ is continuous. So, as $x \to 0$, we have $c \to 0$, so it follows that $f^{(N+1)}(c) \sim f^{(N+1)}(0)$. Putting the three factors together and taking absolute values, we have

$$|f(x) - P_N(x)| = |R_N(x)| = \left|\frac{f^{(N+1)}(c)}{(N+1)!}\right| |x|^{N+1} \sim \left|\frac{f^{(N+1)}(0)}{(N+1)!}\right| |x|^{N+1}$$

as $x \to 0$. Actually, we can let $C = f^{(N+1)}(0)/(N+1)!$ and notice that $C$ is just some constant which doesn’t depend on $x$. So we have

$$|f(x) - P_N(x)| \sim |C|\,|x|^{N+1} \quad\text{as } x \to 0.$$


Great. Now let’s look at the right-hand side of the inequality we’re trying to prove. This is the quantity $|f(x) - Q(x)|$. Let’s write $f(x) = P_N(x) + R_N(x)$, so that

$$|f(x) - Q(x)| = |P_N(x) + R_N(x) - Q(x)| = |S(x) + R_N(x)|,$$

where we have lumped together $P_N$ with $Q$ by setting $S(x) = P_N(x) - Q(x)$. Let’s take a closer look at $S$. It is the difference between two polynomials of degree no more than $N$ which are not the same polynomial. So $S$ is a polynomial of degree less than or equal to $N$, but it’s not the zero polynomial. Let’s suppose that if you write out $S(x)$ in powers of $x$, it looks something like this:

$$S(x) = a_m x^m + \cdots,$$

where $a_m x^m$ is the lowest-degree term. The number $m$ has to be between 0 and $N$, since $S$ has degree less than or equal to $N$. We know that $S$ behaves like its lowest-degree term (see Section 21.4.1 of Chapter 21 for a discussion of this). That is, $S(x) \sim a_m x^m$ as $x \to 0$. On the other hand, we need to look at $S(x) + R_N(x)$ since that is the right-hand side of our desired inequality. We have already seen that $R_N(x) \sim C x^{N+1}$ as $x \to 0$, so the lowest-degree term in $S(x) + R_N(x)$ still looks like $a_m x^m$ (remember, $m \le N$ so $x^m$ is a lower-degree term than $x^{N+1}$). So, all up, we have

$$|f(x) - Q(x)| = |S(x) + R_N(x)| \sim |a_m|\,|x|^m \quad\text{as } x \to 0.$$

Great. We want to prove that the inequality

$$|f(x) - P_N(x)| < |f(x) - Q(x)|$$

is true when $x$ is near 0. We know that $|f(x) - P_N(x)| \sim |C|\,|x|^{N+1}$ and $|f(x) - Q(x)| \sim |a_m|\,|x|^m$ as $x \to 0$. Since $m < N + 1$ (and $|C|$ and $|a_m|$ are constant), it is easy to see that the quantity $|C|\,|x|^{N+1}$ is much smaller than $|a_m|\,|x|^m$ when $x$ is small. Indeed, the ratio of the two quantities is

$$\frac{|C|\,|x|^{N+1}}{|a_m|\,|x|^m} = C_1 |x|^{N+1-m},$$

where $C_1 = |C|/|a_m|$ is just another constant. The right-hand quantity goes to 0 as $x \to 0$, since the exponent $N + 1 - m$ is at least 1. So, the above inequality is indeed true when $x$ is close to 0 and we have finally proved our Taylor approximation theorem!

Actually, there is one little point we didn’t cover: we assumed that $a = 0$. To get from this situation to the general situation, all you have to do is replace the quantity $x$ by the translated quantity $(x - a)$ everywhere you see it in the above argument. The only thing you have to note is that $(x - a) \to 0$ is the same thing as $x \to a$. I leave it to you to fill in the details. Well done if you made it through the above proof.
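
If you want to watch the theorem in action, here is a sketch. Everything specific in it is my own assumption for illustration: the function is $f(x) = e^x$, the order is $N = 2$ (so $P_2(x) = 1 + x + x^2/2$), and the competitor is $Q(x) = 1 + x + 0.6x^2$, which differs from $P_2$ only in the $x^2$ coefficient.

```python
import math

def p2(x):
    # Second-order Taylor polynomial of e^x about 0
    return 1 + x + x ** 2 / 2

def q(x):
    # A competing quadratic (assumption): same as p2 except the x^2 coefficient
    return 1 + x + 0.6 * x ** 2

for x in (1.0, 0.1, -0.1, 0.01):
    err_p = abs(math.exp(x) - p2(x))
    err_q = abs(math.exp(x) - q(x))
    winner = "P_2" if err_p < err_q else "Q"
    print(f"x = {x:5}: |f - P_2| = {err_p:.3e}, |f - Q| = {err_q:.3e}, closer: {winner}")
```

Notice that at $x = 1$ the competitor $Q$ actually does better. That doesn't contradict the theorem, which only promises that $P_N$ wins once $x$ is close enough to 0.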









APPENDIX B


Estimating Integrals

Most of the time when we’ve looked at definite integrals, we’ve been used to giving an exact answer by using antiderivatives and the Second Fundamental Theorem. In real life, alas, finding antiderivatives in a useful form can be difficult or impossible. Sometimes the best you can do is find an approximation to the value of your integral. So we’ll look at three techniques for estimating definite integrals: strips, the trapezoidal rule, and Simpson’s rule. In summary, here’s the plan for this final appendix:

• estimating definite integrals using strips, the trapezoidal rule, and Simpson’s rule; and

• estimating the error in the above approximations.

B.1 Estimating Integrals Using Strips

Here’s a perfectly reasonable definite integral:

$$\int_0^2 e^{-x^2}\,dx.$$

It corresponds to the area of the region bounded by the $x$-axis, the curve $y = e^{-x^2}$, and the lines $x = 0$ and $x = 2$, like this:



Finding an area like this might seem a little technical, but it’s actually incredibly useful. The above curve is commonly known as a bell-shaped curve,* and it is fundamental in the study of probability theory. So it’s especially annoying that there’s no nice, simple way to write down the antiderivative

$$\int e^{-x^2}\,dx.$$

Actually, you can use Maclaurin series to express this integral as an infinite series, but that’s not so nice or simple. The cold, hard reality of the situation is that there’s no way to write down the exact value of the definite integral at the beginning of this section in a simple, closed form. (We already discussed this point in Section 16.5.1 of Chapter 16.)

* Technically the bell-shaped curve, or normal distribution, is actually given by the equation $y = e^{-x^2/2}/\sqrt{2\pi}$.

On the other hand, we can find an approximate value for the integral (an estimate, if you prefer) by using the definition of the Riemann integral. Indeed, in Section 16.2 of Chapter 16, we looked at partitions, meshes, and Riemann sums. Since the integral is the limit of Riemann sums, we can get an approximation simply by not taking the limit. So, to estimate the integral

$$\int_a^b f(x)\,dx,$$

you can chop up the interval $[a, b]$ into a partition of the form

$$a = x_0 < x_1 < x_2 < \cdots < x_{n-1} < x_n = b,$$

then choose a point $c_1$ in $[x_0, x_1]$, a point $c_2$ in $[x_1, x_2]$, and so on until you choose $c_n$ in $[x_{n-1}, x_n]$. At that point, you’re ready to write

$$\int_a^b f(x)\,dx \cong \sum_{j=1}^{n} f(c_j)(x_j - x_{j-1}).$$

This just says that the integral is approximately equal to one of its Riemann sums.

It all seems pretty abstract. Let’s see how it works in the case of our above example. We’re integrating from 0 to 2, so we need a partition of the interval $[0, 2]$. The simplest partition of that interval is just the interval $[0, 2]$ itself, which corresponds to the choices $n = 1$, $x_0 = 0$, and $x_1 = 2$. We just need to pick $c_1$ inside $[0, 2]$. The approximation we’ll end up with depends a lot on this choice! For example, if you choose $c_1 = 0$, $c_1 = 1$, or $c_1 = 2$, then your approximations will end up being the areas of the rectangles with base $[0, 2]$ and heights $f(0)$, $f(1)$, and $f(2)$, respectively.

The first one is clearly a huge overestimate, while the third one is an underestimate. The middle one isn’t so bad, but still not perfect. In order to work out the values of these three estimates, we’ll use the formula

$$\int_0^2 e^{-x^2}\,dx \cong \sum_{j=1}^{n} f(c_j)(x_j - x_{j-1}).$$

Replacing $n$ by 1, $f(c_1)$ by $e^{-c_1^2}$, $x_0$ by 0, and $x_1$ by 2, we get

$$\int_0^2 e^{-x^2}\,dx \cong e^{-c_1^2}(2 - 0) = 2e^{-c_1^2}.$$

When $c_1$ is 0, 1, and 2, these values are 2, $2/e \approx 0.736$, and $2/e^4 \approx 0.037$, respectively. As you can see, there’s a lot of difference between these three estimates!

Now let’s see if we can do better by using more strips. Suppose we take a five-strip partition of $[0, 2]$ that looks like this:

$$0 < \tfrac{1}{2} < 1 < \tfrac{5}{4} < \tfrac{3}{2} < 2.$$

So $n = 5$, and $x_0 = 0$, $x_1 = \tfrac{1}{2}$, $x_2 = 1$, $x_3 = \tfrac{5}{4}$, $x_4 = \tfrac{3}{2}$, $x_5 = 2$. Suppose that we choose our numbers $c_j$ to be at the left-hand endpoint of each little interval. This means that $c_1 = 0$, $c_2 = \tfrac{1}{2}$, $c_3 = 1$, $c_4 = \tfrac{5}{4}$, and $c_5 = \tfrac{3}{2}$. Plugging everything into the above approximation formula, we have

$$\int_0^2 e^{-x^2}\,dx \cong \sum_{j=1}^{5} f(c_j)(x_j - x_{j-1}) = \sum_{j=1}^{5} e^{-c_j^2}(x_j - x_{j-1})$$

$$= e^{-0^2}\left(\tfrac{1}{2} - 0\right) + e^{-(1/2)^2}\left(1 - \tfrac{1}{2}\right) + e^{-1^2}\left(\tfrac{5}{4} - 1\right) + e^{-(5/4)^2}\left(\tfrac{3}{2} - \tfrac{5}{4}\right) + e^{-(3/2)^2}\left(2 - \tfrac{3}{2}\right).$$

This can be simplified a little if you like, or you can just use a calculator or computer to see that it is equal to 1.0865 to four decimal places. Now, I leave it to you to write down the value of the estimate that we would have gotten had we used the right-hand endpoint of each little interval instead of the left-hand one.
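
If you'd rather let a computer do the arithmetic, here is a minimal sketch of a Riemann sum over an arbitrary partition. The function name `riemann_sum` and the way the sample points are passed in are my own choices, not anything from the text.

```python
import math

def f(x):
    return math.exp(-x * x)

def riemann_sum(func, partition, sample_points):
    """Sum of func(c_j) * (x_j - x_{j-1}) over the strips of the partition.

    partition holds x_0 < x_1 < ... < x_n; sample_points holds c_1, ..., c_n,
    with each c_j expected to lie in [x_{j-1}, x_j].
    """
    strips = zip(sample_points, partition, partition[1:])
    return sum(func(c) * (right - left) for c, left, right in strips)

# One strip on [0, 2] with c_1 = 0, 1, 2
for c1 in (0, 1, 2):
    print(f"one strip, c_1 = {c1}: {riemann_sum(f, [0, 2], [c1]):.3f}")

# Five uneven strips, using left-hand endpoints
partition = [0, 0.5, 1, 1.25, 1.5, 2]
left_estimate = riemann_sum(f, partition, partition[:-1])
print(f"five strips, left endpoints: {left_estimate:.4f}")

# The right-hand version from the exercise uses partition[1:] as the sample points
right_estimate = riemann_sum(f, partition, partition[1:])
```

Because the function is decreasing on $[0, 2]$, the left-endpoint sum overshoots and the right-endpoint sum undershoots, so the true value is trapped between `right_estimate` and `left_estimate`.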


B.1.1 Evenly spaced partitions

It’s often convenient to take your partition to be evenly spaced. This means that each of the little intervals has the same width, and it’s not too hard to work out what that width is. If the interval of integration is $[a, b]$, then its length is $b - a$ units; so if you chop up the interval into $n$ equal pieces, then each piece has length $(b - a)/n$ units. We’ll call this quantity $h$; so $h = (b - a)/n$. Furthermore, the expression $(x_j - x_{j-1})$, which appears in the definition of the Riemann sum, is just the width of the $j$th strip, so this is exactly $h$ as well. Our expression

$$\int_a^b f(x)\,dx \cong \sum_{j=1}^{n} f(c_j)(x_j - x_{j-1})$$

can now be simplified to

$$\int_a^b f(x)\,dx \cong h \sum_{j=1}^{n} f(c_j).$$

You still need to choose the numbers $c_j$, but things are a lot simpler. For example, let’s estimate our integral

$$\int_0^2 e^{-x^2}\,dx$$

using 10 strips of equal width. The width of each strip is $h = (2 - 0)/10$, or $1/5$, and $n = 10$, so we have

$$\int_0^2 e^{-x^2}\,dx \cong h \times \sum_{j=1}^{n} f(c_j) = \frac{1}{5} \sum_{j=1}^{10} e^{-c_j^2}.$$


The intervals all have width $\tfrac{1}{5}$, so starting from 0, we see that they are partitioned as follows:

$$0 < \tfrac{1}{5} < \tfrac{2}{5} < \tfrac{3}{5} < \tfrac{4}{5} < 1 < \tfrac{6}{5} < \tfrac{7}{5} < \tfrac{8}{5} < \tfrac{9}{5} < 2.$$

If we let $c_j$ be at the right-hand endpoint in each case, then we’ll have $c_1 = \tfrac{1}{5}$, $c_2 = \tfrac{2}{5}$, and so on up to $c_{10} = 2$. Plugging these numbers into the above formula, we have

$$\int_0^2 e^{-x^2}\,dx \cong \frac{1}{5}\left(e^{-(1/5)^2} + e^{-(2/5)^2} + \cdots + e^{-(9/5)^2} + e^{-2^2}\right).$$

There are ten terms in the sum. Since our function $f$ is decreasing between 0 and 2, and we’ve used the right-hand endpoint for each strip, the above estimate is an underestimate. (Can you see why?) In any case, you can use a calculator or computer to find that the above sum is approximately 0.783670 (to six decimal places).

Now, what if you wanted to use the midpoint of each interval, rather than the left-hand or right-hand boundary? Well, the midpoint of $[0, \tfrac{1}{5}]$ is $\tfrac{1}{10}$, the midpoint of $[\tfrac{1}{5}, \tfrac{2}{5}]$ is $\tfrac{3}{10}$, and so on. So another possible approximation is given by

$$\int_0^2 e^{-x^2}\,dx \cong \frac{1}{5}\left(e^{-(1/10)^2} + e^{-(3/10)^2} + \cdots + e^{-(17/10)^2} + e^{-(19/10)^2}\right).$$

This is approximately 0.882202.
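
Here is a sketch of the evenly spaced version in code. The helper `strip_estimate` and its `position` parameter (0 for left-hand endpoints, 0.5 for midpoints, 1 for right-hand endpoints) are my own conventions, introduced just so that one function covers all three choices of $c_j$.

```python
import math

def f(x):
    return math.exp(-x * x)

def strip_estimate(func, a, b, n, position):
    """Evenly spaced Riemann sum h * sum(func(c_j)) on [a, b] with n strips.

    position picks c_j within each strip: 0.0 is the left-hand endpoint,
    0.5 is the midpoint, and 1.0 is the right-hand endpoint.
    """
    h = (b - a) / n
    return h * sum(func(a + (j + position) * h) for j in range(n))

right = strip_estimate(f, 0, 2, 10, 1.0)
middle = strip_estimate(f, 0, 2, 10, 0.5)
left = strip_estimate(f, 0, 2, 10, 0.0)

print(f"right-hand endpoints: {right:.6f}")
print(f"midpoints:            {middle:.6f}")
print(f"left-hand endpoints:  {left:.6f}")
```

The first two printed values correspond to the two estimates worked out above; the left-hand version is thrown in for comparison.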


B.2 The Trapezoidal Rule

There’s quite a bit of a burden involved in picking the numbers Cj. Most of the 
time, people choose either the left-hand endpoint or the right-hand endpoint, 
but the midpoint is also a common (and reasonable) choice. Here’s another 
method for estimating integrals that removes the element of choice (once you 
decide to use the method, of course!) while giving even better estimates. It’s 
called the trapezoidal rule. 

The idea is very simple: allow the tops of the strips to be nonparallel 
to the base. The top of each strip will be the line segment joining the two 
corresponding points on the curve y = f(x). Here’s a picture illustrating the 
difference in the two approaches: 











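This means that we can collect a lot of the terms together. In particular, except for $x_0$ and $x_n$, every term of the form $f(x_j)$ needs to be counted twice. For example, if $n = 4$, we have

$$\int_a^b f(x)\,dx \cong \frac{h}{2}\Bigl(\bigl(f(x_0) + f(x_1)\bigr) + \bigl(f(x_1) + f(x_2)\bigr) + \bigl(f(x_2) + f(x_3)\bigr) + \bigl(f(x_3) + f(x_4)\bigr)\Bigr).$$

So we can group all but the first and last terms in the sum into pairs to get

$$\int_a^b f(x)\,dx \cong \frac{h}{2}\bigl(f(x_0) + 2f(x_1) + 2f(x_2) + 2f(x_3) + f(x_4)\bigr).$$

The same trick works in general; we end up with:

Trapezoidal rule: if $x_0 < x_1 < \cdots < x_n$ is an evenly spaced partition of $[a, b]$, and $h = (b - a)/n$, then

$$\int_a^b f(x)\,dx \cong \frac{h}{2}\bigl(f(x_0) + 2f(x_1) + 2f(x_2) + \cdots + 2f(x_{n-1}) + f(x_n)\bigr).$$

Let’s apply this to our integral $\int_0^2 e^{-x^2}\,dx$. We’ll take $n = 5$. Since $[0, 2]$ has length 2 units, the width of each strip is therefore $h = \tfrac{2}{5}$ units, and the partition is

$$0 < \tfrac{2}{5} < \tfrac{4}{5} < \tfrac{6}{5} < \tfrac{8}{5} < 2.$$

According to the trapezoidal rule, we have

$$\int_0^2 e^{-x^2}\,dx \cong \frac{2/5}{2}\left(e^{-0^2} + 2e^{-(2/5)^2} + 2e^{-(4/5)^2} + 2e^{-(6/5)^2} + 2e^{-(8/5)^2} + e^{-2^2}\right).$$

If you want, you can simplify the right-hand side to

$$\frac{1}{5}\left(1 + 2e^{-4/25} + 2e^{-16/25} + 2e^{-36/25} + 2e^{-64/25} + e^{-4}\right).$$

You could also use a calculator or computer to see that the above number is 0.881131 to six decimal places. This is somewhat less than the estimate 1.0865 which we found at the end of Section B.1 above, but quite close to the estimate 0.882202 from the end of Section B.1.1.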
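
The trapezoidal rule is only a few lines of code. This is a minimal sketch; the function name `trapezoidal_rule` is my own, and it follows the 1, 2, 2, ..., 2, 1 coefficient pattern directly.

```python
import math

def f(x):
    return math.exp(-x * x)

def trapezoidal_rule(func, a, b, n):
    """Trapezoidal estimate of the integral of func over [a, b] with n strips."""
    h = (b - a) / n
    # Interior points x_1, ..., x_{n-1} are each shared by two trapezoids
    interior = sum(func(a + j * h) for j in range(1, n))
    return (h / 2) * (func(a) + 2 * interior + func(b))

estimate = trapezoidal_rule(f, 0, 2, 5)
print(f"trapezoidal rule, n = 5: {estimate:.6f}")
```

Changing the last argument lets you see how the estimate moves as you take more strips.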


B.3 Simpson’s Rule

Why stop at trapezoids? They still have a clunky linear top. We can do better by using a curve at the top of the strip, rather than a line segment. Here’s how it’s done. Start by looking at a pair of adjacent strips, but instead of connecting the tops with two line segments, use a quadratic curve. The area under that quadratic curve between $x_0$ and $x_2$ (see Section B.3.1 below for the proof) is

$$\frac{h}{3}\bigl(f(x_0) + 4f(x_1) + f(x_2)\bigr)$$

square units, where we have once again set $h = (b - a)/n$. Now if we repeat this for each pair of strips and add up all the areas, we’ll get our approximation. As in the case of the trapezoidal rule, adjacent pairs of strips share an edge, so there’s some doubling up. For example, if there are four strips, then the total would be

$$\frac{h}{3}\Bigl(\bigl(f(x_0) + 4f(x_1) + f(x_2)\bigr) + \bigl(f(x_2) + 4f(x_3) + f(x_4)\bigr)\Bigr);$$

the two terms of the form $f(x_2)$ combine to give $2f(x_2)$, so the total is

$$\frac{h}{3}\bigl(f(x_0) + 4f(x_1) + 2f(x_2) + 4f(x_3) + f(x_4)\bigr).$$

The same pattern persists with more strips, so that the coefficient of $f(x_j)$ is equal to 2 if $j$ is even and 4 if $j$ is odd, except for $f(x_0)$ and $f(x_n)$, which both have coefficient 1. All in all, we have:

Simpson’s rule: if $n$ is even, $x_0 < x_1 < \cdots < x_n$ is an evenly spaced partition of $[a, b]$, and $h = (b - a)/n$, then

$$\int_a^b f(x)\,dx \cong \frac{h}{3}\bigl(f(x_0) + 4f(x_1) + 2f(x_2) + 4f(x_3) + \cdots + 2f(x_{n-2}) + 4f(x_{n-1}) + f(x_n)\bigr).$$

Compare this to the trapezoidal rule from the previous section. Instead of coefficients which look like $1, 2, 2, \ldots, 2, 2, 1$, this time the coefficients follow the pattern $1, 4, 2, 4, 2, \ldots, 2, 4, 2, 4, 1$. Also note that the denominator of the constant out front is 3 instead of 2.

Simpson’s rule is pretty easy to apply. Let’s go back to our old example of

$$\int_0^2 e^{-x^2}\,dx,$$

and use Simpson’s rule with $n = 8$. (We can’t do $n = 5$, since $n$ has to be even in order to use Simpson’s rule.) The length of each strip is $h = (2 - 0)/8$ units, which is $\tfrac{1}{4}$, so the partition is

$$0 < \tfrac{1}{4} < \tfrac{1}{2} < \tfrac{3}{4} < 1 < \tfrac{5}{4} < \tfrac{3}{2} < \tfrac{7}{4} < 2.$$

By the above formula, we have

$$\int_0^2 e^{-x^2}\,dx \cong \frac{1/4}{3}\Bigl(e^{-0^2} + 4e^{-(1/4)^2} + 2e^{-(1/2)^2} + 4e^{-(3/4)^2} + 2e^{-1^2} + 4e^{-(5/4)^2} + 2e^{-(3/2)^2} + 4e^{-(7/4)^2} + e^{-2^2}\Bigr).$$

This is approximately 0.882066, according to my calculator. This is quite close to our estimate from the previous section; specifically, when we used the trapezoidal rule with $n = 5$, we got the estimate 0.881131. For the record, my computer program says that the correct value of the integral is 0.882081 to six decimal places, so Simpson’s rule (with $n = 8$) is better than the trapezoidal rule (with $n = 5$). Of course, a fairer comparison would be to use $n = 8$ in both cases; I leave it to you to repeat the trapezoidal rule computation in this case and compare the result with the corresponding Simpson’s rule estimate which we just found.
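
Here is a sketch that carries out the Simpson's rule computation and the suggested fairer comparison. The function names are my own. For the reference value I'm using the standard fact that $\int_0^2 e^{-x^2}\,dx = \tfrac{\sqrt{\pi}}{2}\,\mathrm{erf}(2)$, with Python's `math.erf`; that identity is my addition, not something derived in the text.

```python
import math

def f(x):
    return math.exp(-x * x)

def trapezoidal_rule(func, a, b, n):
    """Trapezoidal estimate with coefficients 1, 2, 2, ..., 2, 1."""
    h = (b - a) / n
    interior = sum(func(a + j * h) for j in range(1, n))
    return (h / 2) * (func(a) + 2 * interior + func(b))

def simpsons_rule(func, a, b, n):
    """Simpson's estimate with coefficients 1, 4, 2, 4, ..., 2, 4, 1 (n must be even)."""
    if n % 2 != 0:
        raise ValueError("Simpson's rule needs an even number of strips")
    h = (b - a) / n
    total = func(a) + func(b)
    for j in range(1, n):
        coefficient = 4 if j % 2 == 1 else 2
        total += coefficient * func(a + j * h)
    return (h / 3) * total

reference = math.sqrt(math.pi) / 2 * math.erf(2)   # closed form via the error function
simpson_8 = simpsons_rule(f, 0, 2, 8)
trapezoid_8 = trapezoidal_rule(f, 0, 2, 8)

print(f"reference value:   {reference:.6f}")
print(f"Simpson, n = 8:    {simpson_8:.6f}  (error {abs(simpson_8 - reference):.2e})")
print(f"trapezoid, n = 8:  {trapezoid_8:.6f}  (error {abs(trapezoid_8 - reference):.2e})")
```

With the same number of strips, the Simpson estimate should land noticeably closer to the reference value than the trapezoidal one.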


B.3.1 Proof of Simpson's rule 

Let’s translate the picture over so that the middle line lies along the $y$-axis. This shifts the $x$-coordinates of the partition endpoints over to $-h$, $0$, and $h$. Instead of writing $f(x_0)$, $f(x_1)$, and $f(x_2)$, let’s just write $P$, $Q$, and $R$, respectively. The top points are connected by some quadratic, but we don’t have a clue what it is. Well, let’s call it $g$ and suppose that $g(x) = Ax^2 + Bx + C$. We know that $P = g(-h)$, $Q = g(0)$ and $R = g(h)$; this means that

$$P = A(-h)^2 + B(-h) + C,$$

$$Q = A(0)^2 + B(0) + C,$$

$$R = Ah^2 + Bh + C.$$

The middle equation says that $C = Q$; then you can rearrange the other two equations to see that $A = (P + R - 2Q)/(2h^2)$. (We don’t need to know what $B$ is!) Now, the shaded area we want is given by

$$\int_{-h}^{h} (Ax^2 + Bx + C)\,dx = \left(\frac{A}{3}x^3 + \frac{B}{2}x^2 + Cx\right)\Bigg|_{-h}^{h} = \frac{2Ah^3}{3} + 2Ch$$

square units, after simplifying. Substituting the values of $A$ and $C$ from above, the above expression reduces to

$$\frac{2h^3}{3} \times \frac{P + R - 2Q}{2h^2} + 2Qh = \frac{h}{3}(P + 4Q + R).$$

Now all we have to do is translate to a more general position (which doesn’t affect the area) and replace $P$, $Q$, and $R$ by the function values $f(x_0)$, $f(x_1)$, and $f(x_2)$, respectively, to get the prototype formula from the beginning of the previous section.
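
If you don't trust the algebra, here is a sketch that checks the prototype formula with exact rational arithmetic, using Python's `fractions.Fraction` so that no rounding gets in the way. The particular coefficients tried are arbitrary choices of mine.

```python
from fractions import Fraction

def exact_area(A, B, C, h):
    """Exact integral of A x^2 + B x + C from -h to h, via the antiderivative."""
    def antiderivative(x):
        return A * x ** 3 / 3 + B * x ** 2 / 2 + C * x
    return antiderivative(h) - antiderivative(-h)

def prototype_formula(A, B, C, h):
    """(h/3)(P + 4Q + R), where P, Q, R are the heights at -h, 0, and h."""
    def g(x):
        return A * x * x + B * x + C
    P, Q, R = g(-h), g(0), g(h)
    return h * (P + 4 * Q + R) / 3

# Arbitrary test quadratics (assumptions), written as (A, B, C, h)
cases = [
    (Fraction(1), Fraction(0), Fraction(0), Fraction(1)),
    (Fraction(3, 2), Fraction(-7), Fraction(5, 4), Fraction(1, 3)),
    (Fraction(-2), Fraction(9, 5), Fraction(11), Fraction(5, 2)),
]
for A, B, C, h in cases:
    print(exact_area(A, B, C, h), prototype_formula(A, B, C, h))
```

In each row the two numbers should agree exactly, whatever $B$ is, which matches the remark that we never needed to know $B$.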

B.4 The Error in Our Approximations

The whole point of making an approximation (or estimate, if you prefer that word), is that you end up with something close to the exact quantity you’re looking for. If you could actually pin your finger on the exact answer, you’d do that, but sometimes it’s just too difficult. So an approximation at least gives you a number close to the exact one. As we’ve seen a number of times, most notably when we looked at linearization and also Taylor series (see Section 13.2 of Chapter 13 and Section 25.3 of Chapter 25), there’s one other important consideration: how good is the approximation? Is your approximate answer at least close to the actual one, or does it suck?

To quantify this, we look once again at the error in the approximation, which is the difference between the actual quantity and the approximation itself. So suppose that we use one of the above techniques (evenly spaced strips, the trapezoidal rule or Simpson’s rule) to approximate the integral $\int_a^b f(x)\,dx$. We’d get something like

$$\int_a^b f(x)\,dx \cong A,$$

where $A$ is our approximate value. The absolute value of the error is then given by

$$|\text{error}| = \left|\int_a^b f(x)\,dx - A\right|.$$

It turns out that we can get some idea of how big the error could possibly be by using the derivatives of $f$, if they exist. In that case, we can let $M_1$ be the maximum value of $|f'(x)|$ on $[a, b]$. Similarly, let $M_2$ be the maximum value of $|f''(x)|$ on $[a, b]$, and finally let $M_4$ be the maximum value of $|f^{(4)}(x)|$ on $[a, b]$. Then one can show the following bounds on the error term, depending on the method used:

$$\begin{aligned}
&\text{for evenly spaced strips,} & |\text{error}| &\le \tfrac{1}{2} M_1 (b - a) h, \\
&\text{for the trapezoidal rule,} & |\text{error}| &\le \tfrac{1}{12} M_2 (b - a) h^2, \\
&\text{for Simpson’s rule,} & |\text{error}| &\le \tfrac{1}{180} M_4 (b - a) h^4.
\end{aligned}$$


Here $h$ is the strip width $(b - a)/n$, as usual. Although the above formulas are pretty similar, there are some differences. First, the coefficients out front are different. Second, different derivatives are involved: for strips, the first derivative comes up (in the form of $M_1$); for the trapezoidal rule, it’s the second derivative; while for Simpson’s rule, it’s the fourth derivative. The most significant difference, however, is the power of $h$ which appears. This shows how much the error decreases as the strip width becomes smaller, which of course happens when you take more strips. As $h$ becomes small, $h^4$ gets smaller much more quickly than $h^2$ or $h$, so Simpson’s rule should kick some serious butt over the other methods when you use lots of strips.
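
To see the effect of the different powers of $h$, here is a small sketch of the three bounds as functions. The function names are mine, and the values $M_1 = M_2 = M_4 = 1$ on $[0, 2]$ are placeholders chosen purely to show the scaling, not values for any particular function.

```python
def strip_error_bound(M1, a, b, n):
    """Bound (1/2) M1 (b - a) h for evenly spaced strips."""
    h = (b - a) / n
    return M1 * (b - a) * h / 2

def trapezoid_error_bound(M2, a, b, n):
    """Bound (1/12) M2 (b - a) h^2 for the trapezoidal rule."""
    h = (b - a) / n
    return M2 * (b - a) * h ** 2 / 12

def simpson_error_bound(M4, a, b, n):
    """Bound (1/180) M4 (b - a) h^4 for Simpson's rule."""
    h = (b - a) / n
    return M4 * (b - a) * h ** 4 / 180

# Placeholder maxima (assumptions) just to compare how the bounds shrink
for n in (10, 20, 40):
    print(
        f"n = {n:2d}: strips {strip_error_bound(1, 0, 2, n):.2e}, "
        f"trapezoid {trapezoid_error_bound(1, 0, 2, n):.2e}, "
        f"Simpson {simpson_error_bound(1, 0, 2, n):.2e}"
    )
```

Each time you double the number of strips, the strip bound is cut in half, the trapezoidal bound drops by a factor of 4, and the Simpson bound drops by a factor of 16.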


B.4.1 

◎ 


Exomples .estimating |He. error 

Let’s see how these errors turn out for the example

$$\int_0^2 e^{-x^2}\,dx,$$

which we’ve looked at earlier in this appendix. First, we’ll set $f(x) = e^{-x^2}$,
and then calculate that

$$f'(x) = -2xe^{-x^2}, \qquad f''(x) = (4x^2 - 2)e^{-x^2}, \qquad f^{(3)}(x) = -4x(2x^2 - 3)e^{-x^2}$$

and $f^{(4)}(x) = 4(4x^4 - 12x^2 + 3)e^{-x^2}$.
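If you want to double-check derivative formulas like these without redoing the algebra, a numerical spot check is one option. The sketch below is only an illustration (the helper `central_difference`, the step size and the sample points are all arbitrary choices, not anything from the book): it compares each formula with a central-difference approximation of the derivative of the formula before it.

```python
import math


def f(x):
    return math.exp(-x * x)


def f1(x):  # claimed first derivative
    return -2 * x * math.exp(-x * x)


def f2(x):  # claimed second derivative
    return (4 * x ** 2 - 2) * math.exp(-x * x)


def f3(x):  # claimed third derivative
    return -4 * x * (2 * x ** 2 - 3) * math.exp(-x * x)


def f4(x):  # claimed fourth derivative
    return 4 * (4 * x ** 4 - 12 * x ** 2 + 3) * math.exp(-x * x)


def central_difference(g, x, step=1e-5):
    """Approximate g'(x) by a symmetric difference quotient."""
    return (g(x + step) - g(x - step)) / (2 * step)


# Each pair is (function, claimed derivative of that function).
chain = [(f, f1), (f1, f2), (f2, f3), (f3, f4)]
sample_points = [0.0, 0.5, 1.0, 1.5, 2.0]

worst = max(
    abs(central_difference(g, x) - dg(x))
    for g, dg in chain
    for x in sample_points
)
print("largest discrepancy:", worst)
```

A tiny discrepancy is consistent with the formulas being right; it is a sanity check, not a proof.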


Let’s find $M_1$ first. This means that we have to find the maximum value of
$|f'(x)|$, which is actually $2xe^{-x^2}$ on $[0, 2]$. Since the second derivative $f''(x)$
is 0 at $x = 1/\sqrt{2}$, and changes sign from negative to positive there, we have
a local minimum for $f'(x)$ at $1/\sqrt{2}$. This means that the minimum value of
$f'(x)$ on $[0, 2]$ is $-\sqrt{2}e^{-1/2}$, so the maximum value of $|f'(x)|$ is $\sqrt{2}e^{-1/2}$. That
is, $M_1 = \sqrt{2}e^{-1/2}$.

Now we can go back to our estimates for the integral from Section B.1.1
above. There we used 10 evenly spaced strips to estimate our integral. Since
$a = 0$, $b = 2$, and $h = (2 - 0)/10 = 1/5$, we have

$$|\text{error using 10 strips}| \le \frac{1}{2} M_1 (b - a) h = \frac{1}{2} \times \sqrt{2}e^{-1/2}\,(2 - 0)\,\frac{1}{5} = \frac{\sqrt{2}}{5}\,e^{-1/2}.$$

This is approximately 0.171553. Note that it doesn’t matter if you use the
left-hand endpoint, the right-hand endpoint, or somewhere in between as your
choice of $c_n$. (In Section B.1.1, we used the right-hand endpoints and then
the midpoints to get two different estimates, but they are both accurate to
about ±0.171553.)
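Here is a rough Python check of the strip calculation. It recomputes the bound and compares it with the actual errors of the right-endpoint and midpoint sums for 10 strips. One assumption is not from the text: the reference value comes from the identity $\int_0^2 e^{-x^2}\,dx = \frac{\sqrt{\pi}}{2}\,\mathrm{erf}(2)$, evaluated with the standard library’s `math.erf`. The variable names are illustrative only.

```python
import math


def f(x):
    return math.exp(-x * x)


a, b, n = 0.0, 2.0, 10
h = (b - a) / n

# M1 is the maximum of |f'(x)| on [0, 2], as found in the text.
M1 = math.sqrt(2) * math.exp(-0.5)
bound = 0.5 * M1 * (b - a) * h

# Riemann sums using right-hand endpoints and midpoints of the 10 strips.
right_sum = h * sum(f(a + j * h) for j in range(1, n + 1))
mid_sum = h * sum(f(a + (j + 0.5) * h) for j in range(n))

# Reference value via the error function (an assumption of this sketch).
exact = 0.5 * math.sqrt(math.pi) * math.erf(2.0)

print("bound:              ", bound)
print("right-endpoint error:", abs(right_sum - exact))
print("midpoint error:      ", abs(mid_sum - exact))
```

Both actual errors should come out below the bound, which only says how bad things could possibly be, not how bad they are.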

Let’s move on to the trapezoidal rule estimate. In Section B.2 above, we
used 5 trapezoids of width $h = 2/5$ to estimate our integral (so $n = 5$). To see
how big the error could be, we’ll need to find $M_2$ by maximizing the value of
$|f''(x)|$ on $[0, 2]$. To do this, look back at the above formulas for $f^{(2)}(x)$ and
$f^{(3)}(x)$. The zeroes of $f^{(3)}(x)$ which lie in $[0, 2]$ are at $x = 0$ and $x = \sqrt{3/2}$,
so these are the critical points of $f^{(2)}(x)$. (Remember, the third derivative is
the derivative of the second derivative!) So we can test the values of $f''(0)$
and $f''(\sqrt{3/2})$, as well as the value $f''(2)$ at the other endpoint 2. We find
$f''(0) = -2$, $f''(\sqrt{3/2}) = 4e^{-3/2}$ and $f''(2) = 14e^{-4}$. The largest of these, in
absolute value, is $f''(0)$. This means that $M_2 = 2$. Now we can estimate the
error (remembering that $h = 2/5$):

$$|\text{error using 5 trapezoids}| \le \frac{1}{12} M_2 (b - a) h^2 = \frac{1}{12} \times 2(2 - 0)\left(\frac{2}{5}\right)^2,$$

which is $0.053333\ldots$ This is a lot less than the error using 10 strips, even
though we only used 5 trapezoids! Since our previous estimate was approximately
0.881131, we have shown that the approximation

$$\int_0^2 e^{-x^2}\,dx \approx 0.881131$$

is accurate to about ±0.053333. (This is certainly consistent with my observation
at the end of Section B.3 above that the correct value is actually
0.882081, to six decimal places.)
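The trapezoidal numbers can be reproduced with a few lines of Python. This is a sketch under the same assumption as any independent check of this integral: the reference value is computed from `math.erf`, which is not how the book obtains it. The helper name `trapezoid` is hypothetical.

```python
import math


def f(x):
    return math.exp(-x * x)


def trapezoid(func, a, b, n):
    """Trapezoidal rule with n trapezoids of equal width h = (b - a) / n."""
    h = (b - a) / n
    inner = sum(func(a + j * h) for j in range(1, n))
    return h * (0.5 * func(a) + inner + 0.5 * func(b))


a, b, n = 0.0, 2.0, 5
h = (b - a) / n

estimate = trapezoid(f, a, b, n)

# M2 = 2 is the maximum of |f''(x)| on [0, 2], as found in the text.
M2 = 2.0
bound = M2 * (b - a) * h ** 2 / 12

# Reference value via the error function (an assumption of this sketch).
exact = 0.5 * math.sqrt(math.pi) * math.erf(2.0)

print("estimate:    ", estimate)
print("bound:       ", bound)
print("actual error:", abs(estimate - exact))
```

The actual error is far smaller than the bound here, which is typical: the bound has to cover the worst function with the same $M_2$, not just this one.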

Finally, we’ll estimate the error using Simpson’s rule. In Section B.3 above,
we used Simpson’s rule with $n = 8$ to show that

$$\int_0^2 e^{-x^2}\,dx \approx 0.882066.$$

We’ll need $M_4$, which is the maximum value of $|f^{(4)}(x)|$ on $[0, 2]$. This could
be very messy, since $f^{(4)}(x) = 4(4x^4 - 12x^2 + 3)e^{-x^2}$. Let’s cheat by finding
the maximum value of each of the three factors. No problem with 4, and $e^{-x^2}$
is positive and is maximized at $x = 0$ (with a value of 1); so we only have to
find the maximum of $|4x^4 - 12x^2 + 3|$ on $[0, 2]$. We have

$$\frac{d}{dx}(4x^4 - 12x^2 + 3) = 16x^3 - 24x = 8x(2x^2 - 3),$$

so the maximum we’re looking for could only occur at one of the critical points
$x = 0$ and $x = \sqrt{3/2}$, or at the other endpoint $x = 2$. Plugging these numbers
in, we find that the greatest value 19 occurs at $x = 2$, which means that

$$|4x^4 - 12x^2 + 3| \le 19$$

on $[0, 2]$. Putting everything together, we can say that

$$M_4 \le 4 \times 19 \times 1 = 76.$$

(Actually, $M_4 = 12$, but you need to look at the fifth derivative of $f$ to
see this, and enough is enough!) Now we can finally use our formula (with
$h = (2 - 0)/8 = 1/4$):

$$|\text{error using Simpson's rule with } n = 8| \le \frac{1}{180} M_4 (b - a) h^4 \le \frac{1}{180} \times 76\,(2 - 0)\left(\frac{1}{4}\right)^4 = \frac{19}{5760}.$$

This is about 0.003299, which is much lower than the previous two errors we 
calculated. 
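The same kind of check works for Simpson’s rule. The sketch below recomputes the estimate with $n = 8$, the crude bound using 76, and the sharper bound using $M_4 = 12$. Two things are assumptions of the sketch rather than part of the book’s argument: the grid search used to confirm $M_4$ numerically, and the reference value taken from `math.erf`. The helper name `simpson` is hypothetical.

```python
import math


def f(x):
    return math.exp(-x * x)


def f4(x):
    """Fourth derivative of f, from the formula in the text."""
    return 4 * (4 * x ** 4 - 12 * x ** 2 + 3) * math.exp(-x * x)


def simpson(func, a, b, n):
    """Simpson's rule with n strips of width h = (b - a) / n; n must be even."""
    if n % 2 != 0:
        raise ValueError("Simpson's rule needs an even number of strips")
    h = (b - a) / n
    odd = sum(func(a + j * h) for j in range(1, n, 2))
    even = sum(func(a + j * h) for j in range(2, n, 2))
    return h / 3 * (func(a) + 4 * odd + 2 * even + func(b))


a, b, n = 0.0, 2.0, 8
h = (b - a) / n

estimate = simpson(f, a, b, n)
crude_bound = 76 * (b - a) * h ** 4 / 180   # uses M4 <= 76
sharp_bound = 12 * (b - a) * h ** 4 / 180   # uses M4 = 12

# Numerical look at M4: largest |f4| on a fine grid over [0, 2].
grid_max = max(abs(f4(2 * k / 10000)) for k in range(10001))

# Reference value via the error function (an assumption of this sketch).
exact = 0.5 * math.sqrt(math.pi) * math.erf(2.0)

print("estimate:    ", estimate)
print("crude bound: ", crude_bound)
print("sharp bound: ", sharp_bound)
print("grid max |f4|:", grid_max)
print("actual error:", abs(estimate - exact))
```

Cheating with 76 instead of 12 costs a factor of about 6 in the bound, but either way the bound is much smaller than the ones for strips and trapezoids.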


B.4.2 Proof of an error term inequality

The proofs of the last two of the three error inequalities in Section B.4 above
are a little beyond the scope of this book, but it’s not too hard to show that
the first one is true:

$$|\text{error}| \le \frac{1}{2} M_1 (b - a) h,$$

where $M_1$ is the maximum value of $|f'(x)|$ in $[a, b]$. Suppose that we use the
left-hand endpoints for our estimate. Let’s look at just one of the strips. If its
base is the interval $[q, q + h]$ (for some $q$), then it looks something like this:


[Figure: the graph of $y = f(x)$ above a single strip of width $h$ whose base runs from $q$ to $q + h$.]

The approximating rectangle has height $f(q)$ and width $h$ units, so the
approximate area is $hf(q)$ square units. How bad could this be, in general?
It all depends on how much the graph of $f$ deviates from the constant line
$y = f(q)$. Here are the two worst-case scenarios:







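[Figure: the two worst-case scenarios, drawn as dashed line segments of slopes $M_1$ and $-M_1$ starting from the point $(q, f(q))$.]

The dashed lines are a line segment of slope $M_1$ and a line segment of slope $-M_1$ beginning at
$(q, f(q))$. The function must be trapped between these two extremes. To see why, note that
the up-sloping line has equation $y = f(q) + M_1(x - q)$. If $f(x)$ ever went above this line
(for some $x$ in the interval $[q, q + h]$), then we would have $f(x) > f(q) + M_1(x - q)$, so

$$\frac{f(x) - f(q)}{x - q} > M_1.$$

By the Mean Value Theorem (see Section 11.3 in Chapter 11), the left-hand side is
equal to $f'(c)$ for some $c$ in $[q, x]$, so $f'(c) > M_1$, which is impossible since $M_1$
is the maximum value of $|f'(x)|$ on $[a, b]$. A similar argument shows that
$f(x)$ always lies above the down-sloping line.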
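Once the function is known to be trapped between the two lines of slope $\pm M_1$ through $(q, f(q))$, the inequality follows from a short calculation. The following is one standard way to finish, offered as a sketch rather than as the book’s own wording. It assumes left-hand endpoints and uses only the symbols $q$, $h$, $n$ and $M_1$ from above.

```latex
% Being trapped between the two lines means that, for q <= x <= q + h,
\[ |f(x) - f(q)| \le M_1 (x - q). \]

% Error on a single strip: compare the true area with the rectangle h f(q).
\[ \left| \int_q^{q+h} f(x)\,dx - h f(q) \right|
   = \left| \int_q^{q+h} \bigl( f(x) - f(q) \bigr)\,dx \right|
   \le \int_q^{q+h} M_1 (x - q)\,dx
   = \tfrac{1}{2} M_1 h^2 . \]

% There are n strips, and nh = b - a, so adding up the strip errors gives
\[ |\text{error}| \le n \cdot \tfrac{1}{2} M_1 h^2
   = \tfrac{1}{2} M_1 (nh)\, h
   = \tfrac{1}{2} M_1 (b - a) h . \]
```

The $\frac{1}{2} M_1 h^2$ on each strip is just the area of the triangle between the constant line $y = f(q)$ and one of the sloping lines.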
Symbol                                        Meaning                                                                                    Page(s)

$\mathbb{R}$                                  set of real numbers
$[a, b]$                                      closed interval from $a$ to $b$                                                            3
$(a, b)$                                      open interval from $a$ to $b$                                                              3
$(a, b]$                                      half-open interval from $a$ to $b$                                                         3
$A \setminus B$                               all numbers in $A$ not including those in $B$                                              5
$f(x)$                                        function $f$ evaluated at $x$                                                              2
$f^{-1}$                                      inverse function of $f$                                                                    8
$f \circ g$                                   composition of $f$ with $g$                                                                12
$\Delta$                                      discriminant of quadratic                                                                  20
$|x|$                                         absolute value of $x$                                                                      23
sin, cos, tan                                 basic trig functions (sine, cosine, tangent)                                               26
sec, csc, cot                                 reciprocal trig functions (secant, cosecant, cotangent)                                    27
sin$^{-1}$, cos$^{-1}$, tan$^{-1}$            inverse trig functions (arcsine, arccosine, arctangent)                                    208-215
sec$^{-1}$, csc$^{-1}$, cot$^{-1}$            inverse reciprocal trig functions (arcsecant, arccosecant, arccotangent)                   216-218
sinh, cosh, tanh                              basic hyperbolic functions (hyperbolic sine, cosine, tangent)                              198-200
sech, csch, coth                              reciprocal hyperbolic functions (hyperbolic secant, cosecant, cotangent)                   198-200
sinh$^{-1}$, cosh$^{-1}$, tanh$^{-1}$         inverse hyperbolic functions (hyperbolic arcsine, arccosine, arctangent)                   220-223
sech$^{-1}$, csch$^{-1}$, coth$^{-1}$         inverse reciprocal hyperbolic functions (hyperbolic arcsecant, arccosecant, arccotangent)  222-223
$\ln(x)$, $\log_e(x)$                         natural logarithm of $x$                                                                   176
$\lim_{x \to a}$                              two-sided limit as $x$ approaches $a$                                                      42, 672
$\lim_{x \to a^+}$                            right-hand limit as $x$ approaches $a$ (from above)                                        44, 680
$\lim_{x \to a^-}$                            left-hand limit as $x$ approaches $a$ (from below)                                         44, 680
DNE                                           limit does not exist                                                                       44
$0/0$, $\infty/\infty$, $0 \times \infty$     indeterminate forms                                                                        58, 293-303
$0^0$, $1^\infty$, $\infty^0$                 indeterminate forms                                                                        293-303
$\overset{\text{l'H}}{=}$                     equals, using l’Hôpital’s Rule                                                             295
$\sim$                                        asymptotic functions or sequences                                                          442, 488
$\approx$                                     approximately equal to                                                                     33
$\Delta x$                                    change in $x$                                                                              91
$f'(x)$                                       derivative of $f$ with respect to $x$                                                      90
$f''(x)$, $f^{(2)}(x)$                        second derivative of $f$ with respect to $x$                                               94
$f^{(n)}(x)$                                  $n$th derivative of $f$ with respect to $x$                                                94
$\frac{dy}{dx}$                               derivative of $y$ with respect to $x$                                                      93
$\frac{d^2y}{dx^2}$                           second derivative of $y$ with respect to $x$                                               94
$x$, $v$, $a$                                 displacement, velocity, acceleration                                                       114
$g$                                           acceleration due to gravity                                                                115
$|AB|$                                        length of line segment $AB$                                                                139
$\triangle ABC$                               triangle with vertices $A$, $B$, $C$                                                       139
$e$                                           base of natural logarithm                                                                  174
$t_{1/2}$                                     half-life of radioactive material                                                          197
[illegible]                                   discontinuity (used in table of signs)                                                     246
$L(x)$                                        linearization                                                                              280
$df$                                          differential of $f$                                                                        282
$\sum_{j=a}^{b}$                              sum from $j = a$ to $b$ of ...                                                             307
$F(x)\big|_a^b$                               $F(b) - F(a)$                                                                              363
$\int_a^b f(x)\,dx$                           definite integral of $f$ with respect to $x$                                               326
$\int f(x)\,dx$                               indefinite integral (antiderivative) of $f$ with respect to $x$                            364
$f_{\text{av}}$                               average value of $f$                                                                       350
$I_n$                                         integral number $n$ (reduction formulas)                                                   419
$\{a_n\}$                                     sequence $a_1, a_2, a_3, \ldots$                                                           478, 483
$\sum_{n=1}^{\infty} a_n$                     infinite series $a_1 + a_2 + a_3 + \cdots$                                                 483
$n!$                                          $n$ factorial ($1 \times 2 \times 3 \times \cdots \times (n - 1) \times n$)                505
$P_N(x)$                                      $N$th-order Taylor polynomial                                                              522
$R_N(x)$                                      $N$th-order remainder term                                                                 524
$(r, \theta)$                                 polar coordinates                                                                          582
[illegible]                                   [illegible]                                                                                595
$z = x + iy$                                  complex number in Cartesian form                                                           596, 599
$z = re^{i\theta}$                            complex number in polar form                                                               600
$e^z$                                         complex exponential of $z$                                                                 598
$\operatorname{Re}(z)$                        real part of $z$                                                                           596
$\operatorname{Im}(z)$                        imaginary part of $z$                                                                      596
$\bar{z}$                                     complex conjugate of $z$                                                                   597
$|z|$                                         modulus of $z$                                                                             597
$\arg(z)$                                     argument of $z$                                                                            601
$y_H$                                         homogeneous solution (differential equations)                                              657
$y_P$                                         particular solution (differential equations)                                               657
absolute convergence, 491, 516
absolute convergence test 

for improper integrals, 447-449, 453 
for series, 490-491, 516-518 
absolute maximum, see global maximum 
absolute minimum, see global minimum 
absolute values, 23-24 

in limits, see limits, involving absolute values
acceleration, 114 

constant negative, 115-117 
alternating series test, 497-499, 516-518, 
548 

antiderivatives, 361 
approximations, see estimates 
arc lengths, 637-639 

parametric formula for, 638 
polar formula for, 639 
arccos(x), see inverse cosine
arcsec(x), see inverse secant 
arcsin(x), see inverse sine 
arctan(x), see inverse tangent 
areas 

between curve and y-axis, 344-346 
between two curves, 342-344
and definite integrals, 326 
and displacement, 314-318 
enclosed by polar curves, 591-593 
signed, 319-320 
unsigned, see unsigned areas 
using definite integrals to find, 339- 
346 

argument, 601 
ASTC method, 31-33 
asymptotes 

horizontal, 47 
misconceptions about, 50 
vertical, 46 


asymptotic functions, 442, 455 
asymptotic sequences, 488 
average speed, 84 
average value of functions, 350 
average velocity, 85, 350 
axis, 632 

base, 167 

bell-shaped curve, 703 
binomial theorem, 539 
blow-up points, 432, see also problem spots 
in interior, 436 
at left-hand endpoint, 433 
at right-hand endpoint, 436 
bounded functions, 431 

cardioid, 589 

Cartesian coordinates, 581, 599 

and complex numbers, see Cartesian 
form 

conversion of from polar coordinates, 
582-583 

conversion of to polar coordinates, 583- 
585 

Cartesian form, 600 

conversion of from polar form, 601 
conversion of to polar form, 601-603 
center of power series, 529 
chain rule, 107-109 

justification of, 113 
proof of, 693-694 

change of base rule (logarithms), 171 
characteristic quadratic equations, 654-656 
closed interval, 3 
codomain, 1 
coefficients 

leading, 20 
of polynomials, 19 
of power series, 527 
of Taylor series, 530 
comparison test 

for improper integrals, 439-441, 455 
for series, 487-488, 510-515 
completeness, 686, 687 
completing the square, 20, 202, 402 
and trig substitutions, 426 
complex conjugate, 597 
complex numbers, 596 
adding, 596 
arguments of, 601 

Cartesian form of, see Cartesian form 

conjugates of, 597 

dividing, 596-598 

and exponentials, 598-599 

imaginary part of, 596 

modulus of, 597 

multiplying, 596 

polar form of, see polar form 

real part of, 596 

representation of on complex plane, 
599-603 

solving $e^z = w$, 610-612
solving $z^n = w$, 604-610

summary of method for, 607 
subtracting, 596 
taking large powers of, 603-604 
complex plane, 599-603 
composition of functions, 11-14 
compound interest, 173-175 
concave down, 237 
concave up, 237 
conditional convergence, 498 
conjugate expression, 61 
constant functions, derivatives of, 102 
constant multiples 

and derivatives, 103, 691 
and integrals, 373 

constant-coefficient differential equations, 
653-665 
continuity 

on an interval, 77 
at a point, 76 
continuous functions, 77 

compositions of, 684-686 
and differentiable functions, 96-97 
examples of, 77-80 
convergence 

absolute, 491, 516 
conditional, 498 
of improper integrals, 433 
of power series, 551-558 
of sequences, 478 


of series, 482 
of Taylor series, 530-534 
correction term, second order, 522, 523 
cosecant, 27 

derivative of, 143 
graph of, 38 

integrals involving powers of, 418 
inverse of, see inverse cosecant 
symmetry properties of, 38 
cosh(x), see hyperbolic cosine
cosine, 26 

derivative of, 142 
graph of, 36 

integrals involving powers of, 413-415 
inverse of, see inverse cosine 
symmetry properties of, 38 
cos(x), see cosine
$\cos^{-1}(x)$, see inverse cosine
cotangent, 27 

derivative of, 143 
graph of, 38 

integrals involving powers of, 418 
inverse of, see inverse cotangent 
symmetry properties of, 38 
coth(x), see hyperbolic cotangent
cot(x), see cotangent 
critical points, 227 

classifying using the first derivative, 
240-242 

classifying using the second derivative, 242-243

csch(x), see hyperbolic cosecant

csc(x), see cosecant 

cylindrical shells, see shell method 

decay constant, 196 
decreasing functions, 236 
definite integrals 
and areas, 326 
basic idea of, 325-330 
and constants, 337 
definition of, 330-334 
estimating, 346-350 
properties of, 334-338 
splitting of into two pieces, 337 
and sums and differences, 338 
degree 

of polynomial, 19 
of Taylor polynomial, 533 
derivatives, 90 

of compositions, 107 
of constant functions, 102 
of constant multiples, 103, 691 
of cos(x), 142 
of cot(x), 143
of csc(x), 143 

of differences, 103, 691-692 
finding using power or Taylor series, 
568-570 

higher-order, 94 

implicit, see implicit differentiation 
of inverse functions, 204-207 
involving trig functions, 141-148 
left-hand, 95 
as limiting ratios, 91-93 
of ln(x), 177-179 
of $\log_b(x)$, 177-179
logarithmic, 189-192 
of logarithms, 177-179 
nonexistence of, 94 
of parametric equations, 578-580 



of piecewise-defined functions, 123, 697-698

in polar coordinates, 590-591 

of products, see product rule 

of quotients, see quotient rule 

right-hand, 95 

of sec(x), 142 

second, 94 

of sin(x), 141 

of sums, 103, 691-692 

table of signs for, 247-248 

of tan(x), 142 

third, 94 

using the definition to find, 99 
using to classify critical points, 240- 
242 

using to show inverse exists, 201-203 
of $x^n$, 101-102
difference of two cubes, 58 
differentiable functions, 90 

and continuous functions, 96-97 
differential, 281-282 
differential equations, 193, 645-646 
constant-coefficient, 653-665 
first order, 645 
first-order homogeneous, 654 
first-order linear, 648-653 
and initial value problems, see initial 
value problems 
and modeling, 665-667 
nonhomogeneous, 656-663 
second-order homogeneous, 654-656 
separable, 646-648 
differentiation, 90 
disc method, 619-620, 622 
discontinuity, 76 
discriminant, 20 


displacement, 85 

and areas, 314-318 
as integral of velocity, 327 
distance (integral of speed), 327 
divergence 

of improper integrals, 433 
of sequences, 478 
of series, 482 
domain, 1 

finding, 4-5 
restricting, 2, 9 
double root, 20, 595, 656 
double-angle formulas, 40, 409 
dummy variable, 43, 308, 356 

e 

definition of, 173-175 
limits involving, 181-182 
endpoints of integration, 326 
envelope, 140 
equating coefficients 

in differential equations, 658 
in partial fractions, 404 
error term 

in linearization, 281, 285-287, 696- 
697 

in Taylor series, 524, 536 

techniques for estimating, 548-550 
estimates 

of definite integrals, 346-350 
error in, 711-714 
using Simpson’s rule, 709-710 
using strips, 703-706 
using the trapezoidal rule, 706-708 
using linearization, 279-281 
using quadratics, 521-522 
using Taylor polynomials, 519-520, 
540-548 

Euler’s identity, 599, 615 
even functions, 14 
product of, 16 
symmetry of graph of, 15 
exponent, 167 

exponential decay, 193, 195-197 
equation describing, 197 
exponential growth, 193-195 
equation describing, 194 
exponential rules, 168 
exponentials 

behavior of near 0, 182-183, 472-473 
behavior of near ±∞, 184-186, 461-464

complex, 598-599 
graph of, 22 

relationship of with logarithms, 169 
theory of, 689-691 
extrema, 225 

Extreme Value Theorem, 227 
proof of, 694-695 

First Fundamental Theorem, 358-361 
proof of, 381-382 
solving problems using, 366-371 
statement of, 360 

first-order differential equations, 645 
homogeneous, 654 
linear, 648-653 
nonhomogeneous, 656-663 

form 

for partial fractions, 399 
for particular solutions, 658, 659 
functions, 1 

asymptotic, 442, 455 
average value of, 350 
based on integral, 355-358 
continuous, see continuous functions 
decreasing, 236 

differentiable, see differentiable functions

even, see even functions 
exponential, see exponentials 
hyperbolic, see hyperbolic functions 
increasing, 236 
integrable, 331 
inverse, see inverse functions 
inverse hyperbolic, see inverse hyperbolic functions

inverse trig, see inverse trig functions 
involving absolute values, see absolute values
linear, see linear functions 
logarithm, see logarithms 
nonintegrable, 353-354 
odd, see odd functions 
poly-type, 67 

rational, see rational functions 
symmetry properties of, 14-16 
trigonometric, see trig functions 
with zero derivative, 236 
Fundamental Theorem of Algebra, 595 
Fundamental Theorem of Calculus 

First, see First Fundamental Theorem

Second, see Second Fundamental Theorem


geometric series, 484-485, 502-503 
global maximum, 226

how to find, 228-230 
global minimum, 226 

how to find, 228-230 
graphs 

of common functions, 19-24 
method for sketching, 250-252 
shifting, 13 
growth constant, 194 

half-life, 196 
half-open interval, 3 
harmonic series, 489 
homogeneous differential equations, 654 
first-order, 654 
second-order, 654-656 
homogeneous solutions, 658 

conflicts with particular solutions, 662-663

horizontal asymptotes, 47 
horizontal line test, 8-9 
hyperbolic cosecant, 199 
inverse of, 222-223 
hyperbolic cosine, 198-200 
inverse of, 220-222 
hyperbolic cotangent, 199 
inverse of, 222-223 
hyperbolic functions, 198-200 
hyperbolic geometry, 198 
hyperbolic secant, 199 
inverse of, 222-223 
hyperbolic sine, 198-200 
inverse of, 220-222 
hyperbolic tangent, 199 
inverse of, 222-223 

imaginary numbers, 596 
imaginary part, 596 
implicit differentiation, 149-154 
in optimization, 274-275 
and second derivatives, 154-156 
improper integrals, 431-476 

absolute convergence test for, see absolute convergence test, for improper integrals

comparison test for, see comparison 
test, for improper integrals 
definition of, 432 

limit comparison test for, see limit comparison test, for improper integrals

and negative function values, 453-454 


geometric progressions, 480-481 
p-test for, see p-test, for improper integrals

and series, 487-491 
splitting of, 452-453 
summary of tests for analyzing, 454- 
456 

increasing functions, 236 
indefinite integrals, 364-366 
indeterminate forms, 58, see also l’Hôpital’s Rule

index of summation, 308 
infimum, 230 

infinite sequences, see sequences 
infinite series, see series 
inflection points, 238-239 
initial value problems (IVP), 646, 647 
constant-coefficient linear, 663-665 
input, 1 

instantaneous velocity, see velocity 

integrable functions, 331 

integral test (for series), 494-497, 509-510 

integrals 

definite, see definite integrals 
improper, see improper integrals 
indefinite, see indefinite integrals 
integrand, 326 

integrating factor, 649, 652-653 
integration 

and partial fractions, see partial fractions

involving powers of cos, 413-415
involving powers of cot, 418
involving powers of csc, 418
involving powers of sec, 416-418 
involving powers of sin, 413-415 
involving powers of tan, 415-416 
involving powers of trig functions, 413- 
421 

overview of techniques of, 429-430 
by parts, 393-397 
substitution method of, 383-391 
using trig identities, 409-413 
using trig substitutions, see trig substitutions

integration by parts, 393-397 
Intermediate Value Theorem (IVT), 80-82 
proof of, 686-687 
interval notation, 3-4 
inverse cosecant 

derivative of, 218 
domain of, 217 
graph of, 217 
limits at ±∞, 218


range of, 217 

symmetry properties of, 217 
inverse cosine 

derivative of, 212 
domain of, 212 
graph of, 211 
range of, 212 

relationship of with inverse sine, 212- 
213 

symmetry properties of, 212 
inverse cotangent 

derivative of, 218 
domain of, 217 
graph of, 217 
limits at ±∞, 218
range of, 217 

symmetry properties of, 217 
inverse functions, 7-8 

derivatives of, 204-207 
existence of, 201-203 
finding, 9 
inverses of, 11 

inverse hyperbolic cosecant, 222-223 
inverse hyperbolic cosine, 220-222 
inverse hyperbolic cotangent, 222-223 
inverse hyperbolic functions, 220-223 
inverse hyperbolic secant, 222-223 
inverse hyperbolic sine, 220-222 
inverse hyperbolic tangent, 222-223 
inverse secant 

derivative of, 217 
domain of, 217 
graph of, 216 
limits at ±∞, 216
range of, 217 

symmetry properties of, 217 
inverse sine, 208-211 
derivative of, 210 
domain of, 210 
graph of, 209 
range of, 210 

relationship of with inverse cosine, 212- 
213 

symmetry properties of, 210 
inverse tangent 

derivative of, 215 
domain of, 215 
graph of, 214 
limits at ±∞, 215
range of, 215 

symmetry properties of, 215 
inverse trig functions, 208-218 
computing, 218-220 
monic quadratics, 20

MVT, see Mean Value Theorem 

natural logarithms, 176 
Newton’s method, 287-292 
formula for, 289 

potential problems with, 290-292 
nonhomogeneous differential equations, 656- 
663 

nonintegrable functions, 353-354 
nonnegative numbers, 2 
normal distribution, 703 
nth term test, 486-487, 503-504 

odd functions, 14 
product of, 16 
symmetry of graph of, 15 
open interval, 3 
optimization, 267-278 

method for solving problems involving, 269

using implicit differentiation in, 274- 
275 

order 

of differential equations, 645 
of Taylor polynomials, 533 
output, 1 
overestimates 

in linearization, 286 
in Taylor series, 541 

p-test 

for improper integrals, 456 
for improper integrals, 444-447 
for series, 489-490, 497, 510-515 
parameters, 576 

and arc lengths, 638 
and surface areas, 643 
parametric equations, 575-578 
derivatives of, 578-580 
second derivatives of, 580-581 
parametrization, 577 
and speed, 639-640 
partial fractions, 397-408 
form for, 399 
main method of, 404 
partial sums, 482, 483 
particular solutions, 657 

conflicts with homogeneous solutions, 
662-663 

finding, 658-662 
partitions, 317, 330, 704 
evenly spaced, 705-706 
mesh of, see mesh 


parts, integration by, 393-397 
periodic, 35, 589, 601 
piecewise-defined functions, derivatives of, 
697-698 

point-slope form, 18 
points of inflection, 238-239 
polar coordinates, 581-590, 599 
and arc lengths, 639 
and complex numbers, see polar form 
conversion of from Cartesian coordinates, 583-585

conversion of to Cartesian coordinates, 
582-583 

sketching curves in, 585-590 
polar curves 

areas enclosed by, 591-593 
tangents to, 590-591 
polar form, 600 

conversion of from Cartesian form, 601- 
603 

conversion of to Cartesian form, 601 
poly-type functions, 67 

behavior of near 0, 469-470 
behavior of near ±∞, 456-459
in limits, see limits, involving poly-type functions
polynomials, 19 

behavior of near 0, 469-470 
behavior of near ±∞, 456-459
coefficients of, 19 
degree of, 19 
leading coefficient of, 20 
power series, 527-529, 615 
convergence of, 551-558 
radius of convergence of, 551-558 
using to find derivatives, 568-570 
powers of x, derivatives of, 101-102, 192-193

problem spots, 436, 437, 451 
absence of, 452 
not at 0 or oo, 475-476 
product rule, 104-105 

for three variables, 112 
justification of, 111-113 
proof of, 692 
for three variables, 105 

quadrant, 28 
quadratics, 20-21 

completing the square in, 20 
and complex numbers, 598 
discriminant of, 20 
double root of, 20 
monic, 20 

quotient rule, 105-106 
proof of, 693 

radar, 584 
radians, 25 

radius of convergence, 551-558 
range, 2 

finding, 5-6 
rates of change, 156 
ratio test, 492-493, 504-508 
rational functions, 21 

in limits, see limits, involving rational functions

integrating, see partial fractions 
real part, 596 

reduction formulas, 419-421 
reference angle, 30 
related rates, 156-165 
relative maximum, see local maximum 
relative minimum, see local minimum 
remainder term, see error term, in Taylor 
series 

restricting the domain, see domain, restricting

Riemann sums, 331, 333, 355, 591, 618, 
704 

right-continuous functions, 77 
right-hand derivatives, see derivatives, right- 
hand 

right-hand limits, see limits, right-hand 
Rolle’s Theorem, 230-233 
proof of, 695 
root test, 493-494, 508 

sandwich principle, 51-54 
proof of, 678 
for sequences, 479 
secant, 27 

derivative of, 142 
graph of, 37 

integrals involving powers of, 416-418 
inverse of, see inverse secant 
symmetry properties of, 38 
sech(x), see hyperbolic secant
second derivatives, 94 
and graphs, 237-239 
and implicit differentiation, 154-156 
of parametric equations, 580-581 
table of signs for, 248-250 
using to classify critical points, 242 
Second Fundamental Theorem, 362-364 
solving problems using, 371-374 


statement of, 363 

second-order correction term, 522, 523 
second-order differential equations 
homogeneous, 654-656 
nonhomogeneous, 656-663 
sec(x), see secant 
$\sec^{-1}(x)$, see inverse secant
separable differential equations, 646-648 
sequences, 477 

asymptotic, 488 
and functions, 478-480 
limits of, 682 
series 

absolute convergence of, 491, 516 
absolute convergence test for, see absolute convergence test, for series

alternating series test for, see alternating series test
basic concepts for, 481-484 
comparison test for, see comparison 
test, for series 

conditional convergence of, 498 
flowchart for investigating, 501-502 
geometric, see geometric series 
harmonic, 489 

and improper integrals, 487-491 
integral test for, see integral test (for series)
limit comparison test for, see limit 
comparison test, for series 
Maclaurin, see Maclaurin series 
with negative terms, 515-518 
nth term test for, see nth term test 
p-test for, see p-test, for series 
power, see power series 
ratio test for, see ratio test 
root test for, see root test 
Taylor, see Taylor series 
telescoping, 311-314 
shell method, 620-622 
sigma notation, 307-314 
signed areas, 319-320 
simple harmonic motion, 145-146 
Simpson’s rule, 709-710 
error in, 711-714 
proof of, 710-711 
sine, 26 

derivative of, 141 
graph of, 35 

important limit involving, 137-140 
integrals involving powers of, 413-415 
inverse of, see inverse sine 
symmetry properties of, 38 
sinh(x), see hyperbolic sine
sin(x), see sine
$\sin^{-1}(x)$, see inverse sine
sketching graphs, 250-266 
of derivatives, 123-126 
in polar coordinates, 585-590 
slicing, 620, 632 
small numbers, 48-49 
smoothness, 75 
solids of revolution 

surface areas of, see surface areas, of 
solids of revolution 
volumes of, see volumes, of solids of 
revolution 

speed 

average, 84 

and parametrization, 639-640 
spiral of Archimedes, 589 
squeeze principle, see sandwich principle 
standard form, of first-order differential equation, 650

strips, estimating integrals using, 703-706 
substitution (integration technique), 383- 
391 

justification of, 392-393 
surface areas of solids of revolution, 640- 
644 

parametric formula for, 643 

table of signs, 245-247 

for the derivative, 247-248 
for the second derivative, 248-250 
tangent (function), 26 
derivative of, 142 
graph of, 37 

integrals involving powers of, 415-416 
inverse of, see inverse tangent 
symmetry properties of, 38 
tangent line, 88-90, see also linearization 
finding equation of, 114 
tanh(x), see hyperbolic tangent 
tan(x), see tangent (function) 
$\tan^{-1}(x)$, see inverse tangent
Taylor approximation theorem, 522 
proof of, 700-702 
Taylor polynomials, 522, 535-536 
finding, 537-539 
Taylor series, 529-530, 535-536 
adding, 565-566 
convergence of, 530-534 
differentiating, 562-563 
dividing, 567-568 

error term in, see error term, in Taylor series


finding, 537-539 

getting new from old, 558-568 

integrating, 563-565 

multiplying, 566-567 

remainder term in, see error term, in Taylor series
and substitution, 560-561 
subtracting, 565-566 
using to find derivatives, 568-570 
Taylor’s Theorem, 523-526
telescoping series, 311-314 
third derivatives, 94 
trapezoidal rule, 706-708 
error in, 711-714 
triangle inequality, 674 
trig functions 

basic properties of, 25 
behavior of near 0, 470-471 
behavior of near ±∞, 459-461
derivatives involving, see derivatives, 
involving trig functions 
extending the domain of, 28-35 
graphs of, 35-38 

integrals involving powers of, 413-421 
limits involving, see limits, involving trig functions
periodicity of, 35 
symmetry properties of, 38 
trig identities 

complementary, 39 
integration involving, 409-413 
involving double angles, 40 
involving sums and differences, 40 
Pythagorean, 39, 410 
trig substitutions, 421-429 

and completing the square, 426 
and square roots, 427-429 
summary of, 426-427 
trigonometric series, 612-615 
triple-boxed principle, 605 

unbounded region of integration, 437 
underestimates 

in linearization, 286 
in Taylor series, 541 
unit circle, 30 
unsigned areas 

and absolute values, 376-379 
finding using definite integrals, 339- 
342 

upper sums, 324, 333, 354 

velocity, 86-87, 114 
average, 85, 350 











continuous, 320-323 
and derivatives, 91 
graphical interpretation of, 87 
vertical asymptotes, 46 
vertical line test, 6-7 
volumes 

of general solids, 631-637 
of generalized cones, 632-636 
by slicing, 620, 632 
of solids of revolution, 617-631 


disc method for finding, see disc 
method 

of regions between curve and y-axis, 
623-624 

of regions between two curves, 625- 
628 

shell method for finding, see shell 
method 

whoop-di-doo, 440, 455 



